This document is the final technical report for the Software Quality
Measurement Demonstration project, contract F30602-85-C-0132. Contract
work was performed by Science Applications International Corporation for
the Rome Air Development Center to provide an evaluation of software
quality measurement guidelines.
of Software Quality Attributes, [...] Aerospace Company
under contract F30602-82-C-0137. The purpose of the guidebook was to
develop a methodology to enable a software acquisition manager to
determine and specify software quality factor requirements.

The final technical report consists of six sections, preceded by an
executive summary.
EXECUTIVE SUMMARY


Contract  Purpose 

This  project  was  conducted  by  Science  Applications  International 
Corporation  for  the  Rome  Air  Development  Center.  The  purpose  of  the 
project  was  to  assess  the  feasibility  and  utility  of  transferring 
software  measurement  technology  to  the  acquisition  environment  using 
software  quality  measurement  guidebooks. 

The  guidebooks  were  written  as  a  set  of  directions  for  the  acquisition 
manager.  They  describe  how  the  manager  is  to  specify  software  quality 
goals,  how  he  is  to  assess  compliance  based  on  evaluation  reports,  and 
how the evaluation reports are to be created. The reports are generally
generated by independent verification and validation personnel, or by
the  developer,  based  on  the  evaluation  of  worksheets.  The  worksheets 
contain  questions  relating  to  each  phase  of  the  software  development 
process  and  are  applied  to  documents  or  to  code. 

The methodology chosen to accomplish this task was to apply the guidebooks to two
test software projects across all phases of development (requirements,
preliminary  design,  detailed  design,  and  coding).  This  included 
specifying  software  quality  goals  and  evaluating  documents  and  code 
written  for  each  of  the  above  phases. 

Technical  Approach 

The  approach  taken  for  this  project  was  to  follow  the  guidebook 
procedures  as  closely  as  possible,  while  keeping  detailed  records  on 
labor  effort,  quality  evaluation  results,  and  on  any  problems  with  the 
methodology  recommended  in  the  guidebooks.  In  addition,  software 
problem  reports  were  written  against  the  documentation  and  code  whenever 
metric  violations  were  uncovered.  This  allowed  SAIC  to  examine  each 
part  of  the  guidebook  by  actually  performing  each  of  the  steps  and 
procedures  described. 

Results of Software Quality Measurement Methodology Evaluation

As  a  result  of  this  process,  SAIC  was  able  to  gather  data  concerning  the 
labor  effort  required  to  perform  the  software  quality  measurement 
process  as  described  in  the  guidebooks.  In  addition,  a  full  quality 
assessment  was  performed  on  each  of  the  two  software  test  projects  under 
examination.  Also  gathered  were  data  concerning  the  difficulty  of 
implementing  and  using  the  steps  and  procedures  described  in  the 
guidebook,  and  comments  on  the  validity  of  the  metric  framework  itself. 

Conclusions

SAIC  has  identified  both  strengths  and  weaknesses  in  the  software 
quality  measurement  methodology  process. 
Some strengths of the guidebooks include:

• The metric framework itself is sound and reasonable. The quality
scores  determined  for  the  test  projects  using  this  framework 
corresponded  to  the  quality  as  assessed  subjectively  by  project 
personnel. 

•  The  guidebook  methodology  is  divided  into  two  volumes  (one  for 
quality  specification  by  the  acquisition  manager,  and  one  for 
quality  evaluation  by  the  developer  or  independent  verification 
and validation personnel). This presentation is logical and
helpful.

•  Specific  steps  and  procedures  are  included  in  the  guidebook,  and 
these  are  very  helpful  in  understanding  the  entire  measurement 
process.  Some  examples  are  given  which  are  also  helpful. 

•  A  proposed  Data  Item  Description  that  would  allow  reporting  of 
quality evaluation results to the acquisition manager or System
Program  Office  is  useful. 

In  contrast,  some  weaknesses  were  also  identified.  These  include: 

•  The  guidebook  presentation  mixes  theory,  justification,  examples, 
and  procedures  to  be  followed.  This  is  often  confusing. 

• Information presented to allow the acquisition manager to
understand the metric framework and quality specification is
sketchy and confusing.

• Guidance supplied in specifying quality goals includes a weighting
technique that will make each set of quality assessment results unique,
and therefore not comparable across projects or applications.

• Metric element questions, used for evaluating document and
software quality, are sometimes very ambiguous and confusing in
themselves.

• No methodology was suggested to support the collection of the data
used to answer metric element questions.

• No detailed guidance was supplied to support evaluators when
problems or questions arise during metric worksheet scoring.

Recommended Changes

Based  on  following  the  steps  and  procedures  contained  in  the  software 
quality  measurement  guidebooks,  SAIC  has  recommended  some  changes  be 
incorporated  into  the  guidebooks  themselves.  These  changes  include: 
• Add more examples to the guidebooks, but separate them from the
steps  and  procedures  shown. 

•  Continue  research  so  that  the  relationship  between  actual  system 
quality  and  the  predicted/measured  system  quality  may  be 
validated. 

• Provide training material and/or classes for acquisition and
evaluation personnel to help standardize the software quality
measurement methodology, and to increase its acceptance and use in
the Department of Defense as a whole.

• Provide more information to the acquisition manager concerning the
relationships and structure of the metric framework, and describe
how he might select subsets of this data to most effectively
ensure that a quality system is developed, even if costs are
restricted.

• Establish procedures for the handling of metric violations so that
the information gathered from the evaluation process is most
effectively used.

• Create workbooks to support the gathering of data needed for
metric evaluation in as efficient and effective a way as possible.

• Provide answer sheets for metric evaluation questions that are
repeated for each unit in the system.

• Create glossaries, examples, and procedures for the use of the
metric evaluators to ensure that the questions are interpreted and
answered correctly.

More  general  recommendations  are: 

• In conjunction with the publication of Department of Defense
pamphlets AFSCP 800-43, Air Force Systems Command Software
Management Indicators, and AFSCP 800-14, Air Force Systems Command
Software Quality Indicators, the Air Force Systems Command should
sponsor the use of the software quality measurement (SQM)
guidebooks on future system acquisitions.

•  RADC  should  identify  current  on-going  DOD  acquisitions  which  are 
using  all  or  part  of  the  SQM  guidebooks  and  capture  data  on  the 
effectiveness  of  the  guidebooks  in  these  acquisitions. 

•  A  research  effort  should  be  funded  to  make  the  changes  recommended 
above  to  the  guidebooks. 
1.0 INTRODUCTION

This  document  is  the  final  technical  report  produced  for  the  Software 
Quality Measurement Demonstration project, performed for the Rome Air
Development Center (RADC) by Science Applications International
Corporation  (SAIC).  This  section  of  the  report  describes  its  content 
and  organization.  It  also  presents  an  overview  of  the  purpose,  goals, 
and  methodology  used  in  the  project. 

1.1  Document  Description 

This  report  was  produced  for  the  Software  Quality  Measurement 
Demonstration (SQMD) project, conducted under contract F30602-85-C-0132.
It contains a description of the project and includes analysis results,
data collected, and recommendations made by SAIC.

The report is organized into the following sections:

• The Executive Summary is a top-level overview of the project
and the final technical report.

• Section 1 describes the document's purpose and organization.
It gives an overview of the methodology used on the contract
and of the projects under analysis.

• Section 2 is an assessment of the measured quality of the
software projects under examination.

• Section 3 includes the methodology evaluation results. These
include the evaluation of training classes, quality goal
specification, and the application of metric worksheets.

•  Section  4  lists  the  recommendations  and  conclusions  drawn  by 
SAIC  based  on  analysis  results. 

•  Section  5  lists  all  documentation  referenced  in  this  report. 

•  Section  6  lists  all  acronyms  used  in  this  report. 

1.2 Project Purpose

The  intent  of  the  SQMD  project  was  to  assess  the  feasibility  and  utility 
of transferring software measurement technology to the acquisition
environment  using  software  quality  measurement  guidebooks.  These 
guidebooks  consist  of  three  volumes:  Specification  of  Software  Quality 
Attributes, Software Quality Specification Guidebook, and Software
Quality  Evaluation  Guidebook. 

Under  contract  F30602-85-C-0132,  SAIC  conducted  an  investigation  into 
the  application  of  software  quality  measurement  (SQM)  technology  to  two 
test projects, and evaluated the technology's utility as a quantitative
input  to  software  quality  assurance.  To  accomplish  this,  the  following 
tasks  were  required  in  the  project  statement  of  work: 

•  Develop  a  plan  to  implement  this  study  based  on  the 
methodology contained in the guidebooks.

•  Conduct  independent  software  quality  evaluation  and 
validation  to  include  goal  specification,  data  collection, 
worksheet  scoring,  assessment  of  time  logs,  and  measurement 
of  goal  achievement  at  review  points. 

• Evaluate the metric framework and methodology; the clarity
and completeness of factors, criteria, and metrics; the ease
of learning, understanding, and applying metric technology;
the usability of the technology within the acquisition
process; the soundness of its approach; the usability and
effectiveness of the metric threshold and weighting approach;
the appropriateness of the selected metrics; and the time
required to perform the software quality measurement process.

1.3 Software Quality Measurement Methodology Overview

The procedures evaluated for this project are contained in the
three-volume methodology guidebooks produced for RADC under contract
F30602-82-C-0137. The guidebooks contain a methodology for the software
acquisition manager to apply software quality specification and
measurement to the software acquisition process. The guidebooks discuss
the software metric framework, and knowledge of this framework is
important to the understanding of this SAIC technical report. For
details of the recommended approach, see the software quality
measurement guidebooks listed in Section 5.0 of this report.

The steps used in the software quality measurement (SQM) methodology
are described briefly below. The methodology is divided between two
guidebooks. One guidebook describes how the acquisition manager is to
specify software quality goals and assess compliance to the goals. The
second guidebook describes the software quality evaluation that is to
be performed and reported to the acquisition manager.

SOFTWARE QUALITY SPECIFICATION. This part of the methodology involves
procedures for specifying quality requirements. It includes methods and
techniques for determining and specifying quality factor requirements,
for  making  quantifiable  tradeoffs  among  quality  factors,  for  relating 
quality  levels  to  cost  over  the  software  life  cycle,  and  for  analyzing 
quality  measurement  data. 

The  steps  to  be  followed  by  the  software  acquisition  manager  as  he 
specifies quality goals include:

• Select and specify quality factors. This procedure consists
of identifying system functions, assigning quality factors
and goals, considering factor interrelationships, and
considering costs.

• Select and specify quality criteria. This procedure consists
of selecting criteria, assigning weighting formulas, and
considering interrelationships.

• Select and qualify metrics. This procedure consists of
identifying metrics, selecting metric elements, and
qualifying metric elements.
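
The three specification steps above can be pictured as filling in a small hierarchy: factors with goals, criteria with weights under each factor, and metrics under each criterion. The sketch below is illustrative only. The criterion names, metric labels, weights, goal value, and the weighted-average formula are all assumptions made for the example; they are not taken from the guidebooks.

```python
from dataclasses import dataclass, field


@dataclass
class Criterion:
    name: str
    weight: float                                 # hypothetical weight chosen by the manager
    metrics: list = field(default_factory=list)   # labels of the selected metrics


@dataclass
class Factor:
    name: str
    goal: float                                   # hypothetical goal on a 0..1 scale
    criteria: list = field(default_factory=list)

    def score(self, criterion_scores):
        """Weighted average of criterion scores (an assumed formula)."""
        total_weight = sum(c.weight for c in self.criteria)
        weighted = sum(c.weight * criterion_scores[c.name] for c in self.criteria)
        return weighted / total_weight


# Hypothetical specification for a single factor.
maintainability = Factor(
    name="MAINTAINABILITY",
    goal=0.8,
    criteria=[
        Criterion("consistency", weight=2.0, metrics=["CS.1", "CS.2"]),
        Criterion("simplicity", weight=1.0, metrics=["SI.1"]),
    ],
)

# Hypothetical measured criterion scores.
measured = {"consistency": 0.9, "simplicity": 0.6}
factor_score = maintainability.score(measured)
```

With the weights shown, the factor score is (2 x 0.9 + 1 x 0.6) / 3 = 0.8; with equal weights the same criterion scores would give 0.75. This is one way to see why project-specific weights make scores hard to compare across projects.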

During the development of the system, the acquisition manager assesses
quality compliance. Based on evaluation reports, he performs this
process near the end of each software development phase just prior to
formal review. The purpose of the process is to assess compliance of
development products with software quality factor requirements contained [...]


1.4 SAIC Technical Approach


SAIC's  approach  to  the  SQMD  project  was  to  apply  the  measurement 
technology  as  described  in  the  guidebooks  while  maintaining  extensive 
records  on  all  aspects  of  the  project.  In  addition,  SAIC  collected  data 
on  the  training  and  experiences  of  both  inexperienced  and  experienced 
team  members  concerning  the  application  of  software  metric  technology. 
This  approach  is  illustrated  in  Figure  1.4-1. 

SAIC's  quantitative  approach  to  data  collection  and  evaluation  was  based 
on  using  forms,  score  sheets,  worksheets,  and  analysis  reports  to 
document  and  record  project  data.  This  data  included  tasks 
accomplished,  labor  effort  required,  results,  and  any  problems 
encountered  during  the  measurement  process. 

Data was collected concerning any difficulties encountered while
implementing the software quality measurement methodology using
Methodology Problem Reports (MPRs), shown in Figure 1.4-2. Any metric
violations were recorded on Technical Problem Reports (TPRs), shown in
Figure 1.4-3. Project time logs were used to collect data to allow
analysis of time required to perform project tasks. Figure 1.4-4
contains this log.

Project tasks were then decomposed into the following steps, each of
which is discussed briefly below:

• Train inexperienced team members

• Specify quality requirements

• Evaluate compliance to quality requirements

• Assess compliance

• Analyze results

Training inexperienced team members was a formal instruction process
involving classroom lectures and workbook exercises. The class was
originally planned only for members of the project team who were not
experienced with metric technology. However, more experienced team
members expressed interest in the training and each decided to attend
class. Section 3.2 discusses this training further. Two junior
analysts joined the project after class completion. Their training was
conducted informally by project leaders.

Specifying quality requirements was accomplished with a user
questionnaire. The questionnaire concerned desirable quality goals, and
was prepared and distributed to the developers of the test project
decision aids and to SAIC project members. Based on these responses and
the SQM methodology, goals for quality factors were determined, criteria
weighted to calculate factor scores, and metric element questions
selected. Section 3.3 discusses this process in further detail.

[Figure 1.4-2. Methodology Problem Report form]

[Figure 1.4-3. Technical Problem Report form. Fields: decision aid
(ECOAEA or ESCMA), number, worksheet, analyst, date, problem, metric
element, document, impact (critical, moderate, light, none), and test
recommendation (test to validate problem, do not test). The TPR response
section records the PAR analyst, date, validity (valid or invalid) with
reason, significance (critical, moderate, light, none), comments, and
probable action (correct before next phase, correct but no set time,
request waiver and not correct, ignore).]
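
The Technical Problem Report form (Figure 1.4-3) could be captured as a simple record, which makes it easier to tabulate developer responses later. The sketch below is only one possible representation: the class and field names are hypothetical, only a subset of the form's fields is modeled, and the sample problem text and choice of decision aid are invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Impact(Enum):
    CRITICAL = "critical"
    MODERATE = "moderate"
    LIGHT = "light"
    NONE = "none"


@dataclass
class TechnicalProblemReport:
    number: int
    decision_aid: str                 # "ECOAEA" or "ESCMA"
    worksheet: str
    metric_element: str
    problem: str
    impact: Impact                    # evaluator's estimate of impact
    test_to_validate: bool = False    # test recommendation
    # Response section, completed by the developer's analyst.
    valid: Optional[bool] = None
    significance: Optional[Impact] = None
    probable_action: Optional[str] = None

    def is_answered(self) -> bool:
        """True once the developer has judged the report valid or invalid."""
        return self.valid is not None


# Hypothetical report and response.
tpr = TechnicalProblemReport(
    number=1,
    decision_aid="ESCMA",
    worksheet="2",
    metric_element="AM.3(1)",
    problem="Recovery from computational failures is not addressed.",
    impact=Impact.MODERATE,
)
tpr.valid = True
tpr.significance = Impact.MODERATE
```

Keeping the evaluator's impact estimate separate from the developer's significance rating preserves both views of the same violation.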

Evaluation  of  compliance  to  quality  requirements  used  the  worksheets 
shown  in  Volume  III  of  the  guidebook.  Quality  measurements  were  taken 
for  each  life  cycle  phase  for  both  test  project  decision  aids.  In  the 
ordinary  acquisition  process,  only  data  selected  during  quality  goal 
specification  would  have  been  collected.  Due  to  the  research  nature  of 
this  effort,  however,  all  metric  questions  were  evaluated  and  scored. 
The  result  was  the  creation  of  two  sets  of  quality  compliance  scores: 
one using only those factors, criteria, and metrics that were selected
and  applicable;  and  one  using  all  metric  element  questions.  Pairs  of 
scores  were  calculated  for  each  test  project  decision  aid  for  each 
software  phase.  Sections  3.4  and  3.5  contain  more  details  on  this 
process. 

Compliance assessment was made based on the quality goals specified for
the decision aids. Comparisons were made between achieved scores and
project goals. Each metric violation was discussed with project
developers to assess its validity and its potential impact on the
decision aid systems. Analysis was done on the achieved system quality
as compared to the predicted quality assessed at each phase of the life
cycle. Section 2.0 presents the results of this assessment.
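
The comparison of achieved scores against goals described above can be sketched as a small function. The function name, the 0..1 score scale, and every number below are assumptions for illustration; they are not project data.

```python
def assess_compliance(goals, achieved):
    """Return {factor: shortfall} for each factor scored below its goal."""
    shortfalls = {}
    for factor, goal in goals.items():
        score = achieved.get(factor)
        if score is None:
            continue  # factor was not scored in this phase
        if score < goal:
            shortfalls[factor] = round(goal - score, 2)
    return shortfalls


# Hypothetical goals and achieved scores for one life cycle phase.
goals = {"RELIABILITY": 0.90, "MAINTAINABILITY": 0.80, "PORTABILITY": 0.70}
achieved = {"RELIABILITY": 0.75, "MAINTAINABILITY": 0.85}
result = assess_compliance(goals, achieved)
```

Running the same comparison once per phase gives the phase-by-phase view of predicted quality that the analysis relies on.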

Based  on  the  data  collected  and  all  tasks  performed,  SAIC  analyzed  the 
results  to  indicate  decision  aid  quality  and  to  evaluate  the  metric 
methodology.  Sections  2.0  and  3.0  contain  the  results  of  this  effort, 
and  Section  4.0  presents  recommendations  and  conclusions  drawn  by  SAIC. 

1.5  Decision  Aid  Overview 

The  decision  aids  were  developed  in  order  to  apply  decision  aid 
technology  to  selected  tasks  of  Tactical  Air  Battle  Staff  decision 
making.  The  purpose  of  the  aids  was  to  aid  in  planning,  designing, 
demonstrating,  and  assessing  the  operational  utility  and  technical 
feasibility  of  the  applied  technology  for  operational  Tactical  Air  Force 
personnel.  The  aids  were  intended  to  focus  on  crisis  and  wartime 
decision-making  having  the  potential  of  materially  affecting  the  outcome 
of  a  battle. 

To  meet  these  goals,  four  aids  were  developed.  SAIC  has  analyzed  two  of 
these: the Enemy Sortie Capability Measurement Aid (ESCMA) and the Enemy
Course  of  Action  Evaluation  Aid  (ECOAEA). 

The  ESCMA  was  designed  to  perform  quantitative  analysis  to  estimate  an 
enemy's  sortie  generation  capability.  It  was  based  on  models  of  how 
components  of  installations  combine  to  determine  sortie  generation,  and 
it  identifies  combinations  of  circumstances  that  can  affect  this 
capability. Expected bad weather and serious hangar damage are two
example  circumstances  which  might  combine  to  reduce  enemy  sorties  from  a 
particular  airfield  or  for  a  particular  mission. 


The ECOAEA allows a user to evaluate various hypotheses on enemy courses
of action based on current intelligence gathered about enemy forces,
weather, supply needs, etc. It estimates which of the actions would be
seen as most favorable from an enemy commander's viewpoint. The aid
prompts the user to give subjective inputs and/or importance weights for
evaluation factors deemed necessary to perform an analysis on the
probability of a certain hypothesis. This information closely resembles [...]


2.0  QUALITY  EVALUATION  RESULTS 


This  section  of  the  report  discusses  the  results  of  assessing  the 
quality  of  the  Enemy  Course  of  Action  Evaluation  Aid  (ECOAEA)  and  the 
Enemy Sortie Capability Measurement Aid (ESCMA). Quality was assessed
for each aid at the requirements, design, detailed design, and coding
levels. Though measurement was not performed concurrently with the
development process, the decision aid systems were the best available
test vehicle for the demonstration of the software quality measurement
methodology. 

In general, measured quality scores did not meet quality specification
goals set at the beginning of this project. This is due, we believe, to
three separate factors.

The first factor concerns the nature of the development of the decision
aid systems. Both aids are proof-of-concept systems that were developed
to demonstrate a capability. While the developer wished to produce and
deliver a high quality system and specified goals to that level, their
achievement was not likely to be within the scope or budget of the
effort itself. The main concern was to create a working product within
the time and budget specified by the program office.

The second factor relates to the software quality measurement
methodology. The metric evaluation elements were not designed to
reflect the quality of decision aid developments. Much of the content
of a decision aid or expert-based system lies in the rule base used to
drive the conclusions. This rule base did not lend itself to the
current metric questions, and was not scored for this assessment. The
present state of the art allows algorithms and more standard data
structures to be evaluated, but does not adequately address the rule
base itself.

The  third  factor  concerns  the  knowledge  of  the  developers  about  software 
quality  assessment  technology.  Software  quality  goal  specification  and 
measurement  were  not  familiar  to  the  engineers  involved  in  the  system 
development and in specifying quality goals. This lack of familiarity
resulted  in  the  engineers  specifying  goals  that  were  above  those  really 
required  to  guarantee  that  the  system  be  effective  and  successful. 

For  both  aids,  the  quality  results  discussed  below  include  only  the 
scores  achieved  by  evaluating  a  selected  set  of  applicable  metric 
questions.  Because  of  the  research  nature  of  this  effort,  data  was 
collected  for  all  metric  questions.  In  actual  use,  however,  only  those 
elements  applicable  to  the  project  and  selected  at  quality  goal 
specification  would  have  been  evaluated. 

The differences between scores calculated using all metric questions (as
was done for research purposes) and scores calculated using only
applicable questions indicate the potential differences caused by
analyst scoring subjectivity. Figure 2.0-1 illustrates this using the
quality factors evaluated for the ECOAEA decision aid on Worksheet 1.
The  lightly  shaded  bars  on  the  figure  represent  scores  calculated  using 
only  the  applicable  metric  elements,  and  the  solid  bars  represent  scores 
calculated  considering  all  elements,  regardless  of  applicability. 

Applicability  is  an  important  attribute  to  consider.  If  such  choices 
are  left  to  the  individual  analyst,  wide  variations  in  scoring  can 
result  purely  from  his  selection  of  how  to  answer  some  questions.  For 
example, an analyst may decide to answer a particular question as "N/A"
rather  than  as  "no"  or  "0."  The  metric,  criteria,  and  factor  scores 
will  each  be  higher  if  "N/A"  is  selected  than  if  "no"  or  "0"  is  used. 
Because  of  this  subjectivity  and  potential  variations  among  analysts, 
SAIC  is  recommending  that  procedures  be  established  specifying  when  the 
"N/A"  answer  may  be  used  (see  Section  4.4.7). 

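The effect of answering "N/A" instead of "no" can be seen with a small worked example. The scoring rule below (a metric score is the fraction of "yes" answers among applicable questions, with "N/A" answers excluded) is an assumption made for illustration; the guidebook worksheets define the actual scoring rules.

```python
def metric_score(answers):
    """Average of applicable answers; "N/A" answers are excluded (assumed rule)."""
    values = {"yes": 1.0, "no": 0.0}
    applicable = [values[a] for a in answers if a != "N/A"]
    if not applicable:
        return None  # nothing applicable, so there is no score
    return sum(applicable) / len(applicable)


# The same four questions, differing only in how the last one is answered.
answers_with_no = ["yes", "yes", "no", "no"]
answers_with_na = ["yes", "yes", "no", "N/A"]

score_no = metric_score(answers_with_no)   # 2 of 4 applicable answers are "yes"
score_na = metric_score(answers_with_na)   # 2 of 3 applicable answers are "yes"
```

Under this assumed rule, changing one answer from "no" to "N/A" raises the metric score from one half to two thirds, and the increase carries upward into the criterion and factor scores.
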
2.1  Quality  Assessment  Results 

In general, neither decision aid achieved the desired software quality
factor goals. Figures 2.1-1 and 2.1-2 present scores for the ECOAEA and
ESCMA, respectively. The factor INTEGRITY shows up as "zero" on the
figures because it is not applicable to either decision aid.

The figures do show a scoring trend, however, in that quality seemed to
increase during the development cycle. Since the measurement of quality
did not take place in parallel with the project development, we cannot
attribute this increase in quality to the use of the software quality
measurement methodology. We believe that the reason lies partly in the
amount of software development documentation available for the aids
[...]. As a result, there was relatively little material upon which to
base the requirements, preliminary design, and [...]

[...] One would typically expect that low quality in the early phases of
development would continue throughout the project. The result would be a
system of low measured quality. While a "lower" score did continue
through the phases, for each aid, there was a definite upward trend in
the values calculated. It is interesting, therefore, to consider how
poorer quality requirements and design documents resulted in higher
quality code.

We believe that this was caused by the size of the two decision aids,
and by the quality of the people who created them. Smaller systems may
be completely understood by the development team, even though the
underlying assumptions, goals, and design are not fully documented. The
developers were able to fully understand their tasks, and to perform
them  even  if  the  documents  were  lacking.  Because  of  their  skill  and 
experience, therefore, the developers were able to create a better
quality system than one would expect based only on project
documentation. This in no way invalidates the software quality
measurement methodology. In a large-scale acquisition, it is less
possible  for  skill  and  experience  to  overcome  early  project 
deficiencies.  It  is  vital  that  early  quality  be  high,  and  that  this 
standard  be  maintained  throughout  development  of  these  systems. 

2.2  Technical  Problems  Uncovered 

On  the  requirements  and  preliminary  design  worksheets,  SAIC  created 
technical  problem  reports  whenever  a  metric  violation  was  uncovered. 
Time  and  budget  did  not  permit  the  generation  of  these  reports  for 
violations uncovered during detailed design and coding worksheet
evaluation.

The purpose of generating technical problem reports was to gather
information concerning the developer's responses to the problems that
were discovered. In all, 249 reports were written. Of these, 105
concerned questions that were actually not applicable to the decision
aid systems or concerned non-existent standards, and no developer
response was requested.

The developer responded to the remaining 144 comments, as shown below:

• 1 technical problem was identified as being both valid and of a
critical nature. This problem concerned the general quality of
the documentation, and particularly the discrepancy between the
designed units and the coded units. The developer agreed with the
problem report, and identified the correction as of critical
importance to the system. Maintainability, in particular, was
described as being lowered by the document quality.

• 5 technical problems were identified as valid and as having a
moderate impact on the quality of the system.

• 109 problems were identified as valid, but with only slight impact
on the quality of the two decision aids.

• 24 problems were identified as valid, but of no impact on the
quality of either aid. This was due chiefly to the size of the
two decision aid systems.

• 5 technical problems were identified as invalid, and as having no
meaning with respect to the quality of the decision aids.

SAIC  agrees  with  the  evaluation  responses  made  by  the  developer. 
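
The report counts above can be cross-checked with simple arithmetic. In the sketch below, the total of 249 reports and the counts of 5 moderate and 109 slight problems are readings chosen because they are consistent with the stated 105 unanswered reports and 144 developer responses; treat them as assumptions rather than confirmed figures. The category labels are shorthand invented for the example.

```python
total_reports = 249            # assumed reading: 105 + 144
no_response_requested = 105    # not applicable, or non-existent standards

# Developer dispositions of the remaining reports.
responses = {
    "valid, critical": 1,
    "valid, moderate impact": 5,    # assumed reading, inferred from the total
    "valid, slight impact": 109,    # assumed reading, inferred from the total
    "valid, no impact": 24,
    "invalid": 5,
}

answered = sum(responses.values())

# The dispositions should account for every answered report.
assert answered == 144
assert answered + no_response_requested == total_reports

# Share of answered reports that the developer judged valid.
valid_share = (answered - responses["invalid"]) / answered
```

Under these readings, 139 of the 144 answered reports were judged valid, although most of those were rated as having slight or no impact.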


A sample of the developer's comments on the problems uncovered is
included below. Both comments were categorized as valid, and as having
moderate impact on the quality of the system.

Concerning Worksheet 2, Metrics AT.1(3), AT.2(3), and AT.3(1):

[Questions  concern  auxiliary  storage  space  allocated,  processing  time 
allocated,  and  I/O  channel  time  allocated] 

Providing more detailed documentation on the allocations [of time and
space] at the outset might be counterproductive because, as the [SAIC]
analyst suggests, of the prototype nature of the aid -- and because of
the comparatively rich facilities (in particular, the VAX 11/750)
available. It could be demonstrated at the outset that time and space
allocations will not be a problem: even a distinctly suboptimal
preliminary design would be adequate to prove that point. ... The main
interest lies in making good -- not just adequate -- time and space
allocations, in particular with a view to subsequent porting to a
microcomputer. Concurrent system design and prototyping, with each
activity informing the other, is an excellent way of achieving good time
and space allocations when, as in the present case, neither time nor
space requirements are overly constrained, and the overall program is a
relatively small one. ... Knowing that one is going to have to formalize
such observations at some point leads to a raising of the consciousness
that may indeed prove beneficial.

Concerning Worksheet 2, AM.3(1): [Recovery from computational failures]

Many failures and errors can be covered by standard programming
techniques and should not have to be addressed individually at the
program design level.

2.3 Score Validation

The decision aid developers conducted testing to verify the validity of
some of the quality factor scores. In particular, they looked at
REUSABILITY, PORTABILITY, and the criterion anomaly management.

2.3.1 ESCMA

Limited time was available to the developers to review the 15,000 lines
of ESCMA code. Since the decision aid was actually developed by a
sub-contractor (Betac), PAR had access neither to object code nor to a
compiler. The comments below should be understood in that light: they
are the best analysis possible, but many checks of the results presented
that would have been possible with the object code or a compiler could
not be made.

Anomaly Management. All file access within ESCMA is handled by
instructions that are built into Pascal. No special provisions for
file-access error handling were included in the code. Accordingly, all
such errors are dealt with in default ways determined by the interaction of


No checks were found for data errors from files. While no underflow or
overflow checking was done within the Pascal code, the linear
programming package used with the system does make such checks.


REUSABILITY. Several of the sections of code could be reused in other
applications. These include the linear programming package, keyboard
procedures, and screen procedures, which together amount to
approximately 15% of the code. No other reusable code was detected.


2.3.2  ECOAEA 


The ECOAEA was written in-house by the developers, and for these tests
they had full access to the code and its modifications.


Anomaly Management. There are several routines in the aid that handle
missing data files. If a file is non-existent, an error routine is
called. The routine prints a message, and then terminates the process.
ECOAEA does not check for the validity of the values it reads or uses.
For this particular aid, most of the values are originally primitive
input values which are then aggregated. As a prototype, ECOAEA assumed
the input data values to be correct, with the assumption that any
produced values were then also valid. All user inputs, other than the
<BREAK> key, are trapped and handled by the program, but there is no
checking for overflow or underflow.


REUSABILITY. There is a routine (get_option in file windows.c) which
returns user inputs at menu selections. It is easily reusable. This
piece of code has been used by the developer in a variety of projects,
with only minor modifications. In addition, the entire file (colors.c)
is a generic version of the UNIX cursor package. Any software which
uses a VT-220 or compatible terminal will be able to use the tools which
this file provides. These tools include code for moving the cursor to a
specific position, clearing a line, clearing a window, for inverse
video, and for specifying foreground and background colors.
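
The kind of terminal tools listed above can be illustrated with a short
sketch. It is not the developer's colors.c, which is not reproduced in
this report. It assumes standard ANSI (ECMA-48) control sequences of the
kind a VT-220 compatible terminal accepts, and every function name in it
is hypothetical. The color sequences additionally assume a
color-capable terminal.

```python
# Illustrative sketch only; these helpers are hypothetical and are not
# the routines found in colors.c.
ESC = "\x1b"
CSI = ESC + "["  # Control Sequence Introducer

# ANSI color numbers; foreground codes are 30 + n, background codes 40 + n.
COLORS = {"black": 0, "red": 1, "green": 2, "yellow": 3,
          "blue": 4, "magenta": 5, "cyan": 6, "white": 7}


def move_cursor(row: int, col: int) -> str:
    """Sequence that moves the cursor to a 1-based (row, col) position."""
    return f"{CSI}{row};{col}H"


def clear_line() -> str:
    """Sequence that erases the entire line the cursor is on."""
    return f"{CSI}2K"


def clear_window(top: int, left: int, height: int, width: int) -> str:
    """Blank a rectangular region by overwriting each row with spaces."""
    blank = " " * width
    return "".join(move_cursor(top + r, left) + blank for r in range(height))


def inverse_video(text: str) -> str:
    """Wrap text in inverse video, then reset all attributes."""
    return f"{CSI}7m{text}{CSI}0m"


def set_colors(foreground: str, background: str) -> str:
    """Sequence that selects foreground and background colors."""
    return f"{CSI}{30 + COLORS[foreground]};{40 + COLORS[background]}m"
```

Each helper returns a string rather than writing to the terminal, so a
caller can compose several sequences and emit them in one write.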


PORTABILITY. The basic underlying software of ECOAEA is currently being
used in another application by the developer. This application involved
porting code from its original development area (VAX 11/750 under UNIX,
32-bit machine) to an Intel 310 machine, running XENIX in a 16-bit
environment. The following is a brief description of the amount of time
and effort spent in transporting.


The  single  biggest  problem  the  developer  had  to  deal  with  was  the  memory 
limitation  and  model  of  the  Intel  machine.  PAR  spent  3-4  days  in 
recompiling  the  pieces  of  the  source  code  with  appropriate  memory 
models.  The  main  problem  encountered  here  was  redefining  the  sizes  of 
the  various  data  structures  used.  The  Intel  machine  is 
byte-addressable, meaning some fields (i.e., integers and floating
point) needed to be aligned on even-numbered addresses. Additionally,

3.0 METHODOLOGY EVALUATION RESULTS

This  section  of  the  technical  report  describes  in  detail  the  results  of 
analysis  and  project  efforts  concerning  the  software  quality  measurement 
(SQM)  methodology  itself.  The  section  is  organized  to  reflect  the 
metric application process, as follows:

• Section 3.1 contains results that apply to the entire quality
specification and evaluation process.

• Section 3.2 describes in detail the analysis of the class
SAIC conducted to train personnel not yet experienced with
software metrics. It also discusses differences found
between the analysis efforts of the experienced and the
inexperienced teams.

• Section 3.3 discusses the process of the specification of
software quality goals by the acquisition manager.

• Section 3.4 contains the results of the application of
metric element worksheet questions to the decision aid
documentation.

• Section 3.5 describes metric scoring as performed on the
decision aids.

Throughout  discussion  of  these  results,  references  are  made  to  the 
recommendations  and  conclusions  contained  in  Section  4.  For  each 
problem  uncovered,  there  is  a  recommended  solution  in  Section  4. 

3.1  General  Results 

This section presents some very generalized results of the methodology
evaluation process. It first discusses some overall observations, and
then presents information on the labor effort required to perform the
quality evaluation on the two decision aid systems.

3.1.1 General Observations

SAIC followed each of the steps of the software quality measurement
methodology presented in the guidebooks. In general, we found the
methodology to be sound. The procedures and steps to be performed are
basically logical, and yield meaningful results. On the
more negative side, we found that details were often presented in a
confusing fashion. The mixture of justification, theory, procedures,
and examples did not do full justice to any part of the methodology.
The metric evaluation elements or questions were, in particular, found
to be ambiguous and confusing.


3.1.2  Methodology  Labor  Effort 

As each step of the software quality measurement methodology was
performed, labor records were kept to indicate how long each took. A
total of 852 hours was spent in specifying quality goals and evaluating
quality for the two decision aids. Figure 3.1-1 shows how the effort
was distributed among the major tasks contained in the guidebook
methodology for quality goal specification and quality evaluation.

The labor required to perform the decision aid quality assessment is
more meaningful when considered against the amount of material analyzed.
Table 3.1-1 shows the size of each of the products analyzed during the
quality assessment procedures.

As was expected, the most time-consuming aspect of this process is the
evaluation of worksheets and collection of metric data. Early
worksheets (0, 1, and 2) were applied only once, and took relatively
little time to complete. The final worksheets (3A & 3B, and 4A & 4B),
applied to each unit in the system, were quite time-consuming. The 3A &
3B worksheets were faster to evaluate than were the 4A & 4B worksheets,
chiefly because not every coded unit was included in the design of
either of the two aids.

We have also correlated the labor effort required to evaluate the
worksheets based on criteria and factors. Tables 3.1-2 and 3.1-3
present the amount of time required to collect quality factor data for
each worksheet, and the total across all worksheets for each aid.
Figures 3.1-2 through 3.1-4 are graphical representations of the time
required to evaluate factors for both aids together (Figure 3.1-2), for
ESCMA alone (Figure 3.1-3), and for ECOAEA alone (Figure 3.1-4).
Appendix B presents the information for the software quality criteria.
This may be more meaningful because criteria which are applicable to
more than one factor are counted more than once in collecting the factor
evaluation time data.

We believe that the differences in time required between the two
decision aids are chiefly due to the size of the documents involved and
the amount of final source code produced. The same tailoring process was
used on each aid, so that there were no differences in the number of
metric questions answered. While each aid was produced by a different
contractor, they are of similar natures and similar measured quality, so
we believe no major differences were introduced in that regard.

In addition to evaluating factor and criterion times, we also calculated
how long each software unit took to evaluate during application of
Worksheet 4B. Figure 3.1-5 represents the time it took to evaluate each
unit in the ECOAEA as plotted against unit size. As expected, there
seems to be a strong relationship between the size of the unit and how
long it takes to answer questions concerning that unit. The points
below the trend line represent units typically more complex than other
units. Conversely, those above represent less complex units.

Figure 3.1-3. Factor Evaluation Time for ESCMA


3.2 Experienced/Inexperienced Project Members


As  part  of  the  project,  SAIC  utilized  two  separate  teams  of  analysts. 
Each  team  evaluated  one  of  the  two  decision  aids  used  as  test  projects. 
The  teams  varied  in  experience,  with  one  team  new  to  metric  application 
and  one  team  consisting  of  personnel  who  had  worked  with  metric  elements 
on  other  projects. 

3.2.1 Class Evaluation

Part of SAIC's approach to the project involved the training of the
inexperienced project members. This training consisted of a formal
class presented over a three-day period. The class was intended to
familiarize students with the metric framework, and with the SQM
methodology specifically. The class consisted of lecture/question
sessions, and included outside "homework" in the form of workbooks.
Figure 3.2-1 presents the outline of the class as taught.

Though the class had originally been planned only for those new to
software measurement technology, each of the more experienced team
members expressed interest in attending. Each experienced person had
worked on other projects involving software metrics, but felt that his
knowledge was incomplete. After discussion of project goals, it was
decided that all members should attend the class. The presence of
experienced personnel gave the "newcomers" the benefit of real-world
experiences during class discussions. In addition, the common
information and focus provided a single unified starting point for all
project tasks. It was also noted that even though experienced, the
"oldtimers" did not know everything discussed in the software quality
measurement methodology guidebooks. The training provided them with new
information to be used on the project.

After class completion, while each team member was actively working on
the specification of quality goals or metric application, each attendee
was asked to evaluate the effectiveness and quality of the class
sessions. Figure 3.2-2 is the form each student was asked to use.
Forms were not filled out by class attendees who subsequently left the
project. Section 3.2.3 discusses project personnel further.

Table  3.2-1  presents  the  ratings  given  to  the  class  by  the  students. 
These  ratings  are  on  a  scale  of  1  to  5,  with  5  meaning  most  effective. 
The  ratings  do  not  correlate  directly  with  levels  of  experience  or  areas 
of  expertise.  Comments  did  consistently  show,  however,  that  the 
students  desired  examples  and  practice  with  actual  metric  question 
evaluation  as  part  of  the  class. 

After training was completed, two junior analysts joined the project.
Both were inexperienced with software measurement methodologies and were
placed on the "inexperienced" team. Training of these analysts was
conducted informally by other project personnel. The result of the
informal training was that while able to complete all tasks, the two
junior analysts had a much less complete idea of the process of metric
specification and evaluation. They had more difficulty understanding
the reasons behind some of the steps. They also tended to believe that
their own lack of understanding was causing a problem, even if the
problem was actually in the methodology itself. This was particularly
evident in worksheet evaluation. Neither analyst produced many
Methodology Problem Reports when they encountered difficulties with a
metric element question. Instead, each believed that any problem lay in
his own lack of understanding, not in the question itself. Since other
analysts examined and evaluated every metric question as well, this did
not impact SAIC analysis efforts.


CLASS RATING FORM

Analyst:                                Date:

Metric Experience Level:    _ Novice   _ Experienced   _ Highly Experienced
Computer Experience Level:  _ Novice   _ Experienced   _ Highly Experienced
Area of Expertise:          _ Programmer   _ Analyst   _ Data Entry and Analysis

Class Rating (Scale of 1 to 5, where 5 means most effective)

                                             Ease of    Questions
Area                        Understandable   Learning   Answered    Applicability

Quality Framework
Factors
Criteria
Metrics
Factors Criteria Metrics
Quality Goal Specification
Metric Application
Score Evaluation

Comments:

FIGURE 3.2-2 CLASS EVALUATION FORM


TABLE 3.2-1 CLASS EVALUATION RESULTS

[Table: class ratings given by Students 1 through 4, each identified by
metric experience, computer experience, and area of expertise, for each
class element and rating area.]

Based on SAIC's experience with training and with metric application, we
are recommending that training (either in the form of classes or
prepared materials) be made available both to acquisition managers and
to those who will be evaluating metric compliance. This is discussed in
paragraph 4.2.3.

3.2.2 Differences Caused by Experience

At the beginning of the project, we expected certain differences to show
up between the more experienced and less experienced project analysts.
We anticipated that the less experienced personnel would perform
specification and assessment tasks less quickly than would the more
experienced. The less experienced team was expected to have more
questions, and to understand the evaluation process less completely.

These expectations proved to be correct. Table 3.2-2 contains
some of the differences between the experienced and inexperienced
project personnel in performing software quality evaluation tasks. The
more experienced project personnel were faster at metric question
evaluation and generated fewer problem reports concerning the
methodology itself.

The less experienced team did, in addition, reach frustration levels
earlier than did the more experienced team. When questions were
particularly confusing or ambiguous, or the documentation being analyzed
particularly deficient, the less experienced analysts found the
experience frustrating. The other team was better able to deal with
these problems.

3.2.3 Project Personnel

Figure  3.2-3  presents  both  the  original  organization  of  the  project  team 
and  the  revised  organization  used  throughout  most  of  the  project.  The 
team  evaluating  the  Enemy  Sortie  Capability  Measurement  Aid,  which  is 
written  in  Pascal,  was  selected  because  of  their  experience  with  that 
language.  The  team  evaluating  the  Enemy  Course  of  Action  Evaluation  Aid 
was selected because of familiarity with the language it is written in.

TABLE 3.2-2 DIFFERENCES FOR EXPERIENCED
AND INEXPERIENCED PERSONNEL


Some personnel changes took place over the life of the project. Ms.
Cindy Bowen, Ms. Susan Fenwick, Ms. Linda Mauer, and Mr. Bill Randall
left the project before worksheet evaluation began. Two junior
analysts, Mr. John Garcia and Ms. Louise Wine, joined the project in
their place and were assigned to the inexperienced project team.

3.3  Software  Quality  Requirements  Specification 

The first step in the software quality measurement (SQM) methodology
involves the specification of quality requirements by the acquisition
manager. These requirements include those applicable to the quality
factors, the criteria to be used and their relative weightings, and the
selection of the applicable metric elements to be evaluated on the
various worksheets.

The methodology, results, and problems encountered during the
application of quality goal specification are described in this section.
The material is organized as follows:

• Section 3.3.1 describes the steps followed in performing the
goal specification and metric selection tasks.

• Section 3.3.2 describes the results and problems uncovered in
specifying factor quality goals for the decision aid systems.

• Section 3.3.3 describes the process of selecting and
weighting the quality criteria. It includes discussion of
the results obtained and the problems uncovered.

• Section 3.3.4 discusses the selection of the metric elements
applicable to the decision aids.

3ecr.'  n  -.I.*  anc.ivzcs  the  eosf 


>  '-‘.lor  -.  ’re'w.'.  the' 

,err  ,  ned  i:'-  c;  ■-  .ef 

■.h-'Sn  pa.’-  .ngr  eons  . 

3.3.1 Methodology Description

The steps in the process are:

• Select and Specify Quality Factors

- Identify functions

- Assign quality factors and goals

- Consider interrelationships

- Consider costs


• Select and Specify Quality Criteria

- Select criteria

- Assign weighting formulas

- Consider interrelationships

• Select and Specify Quality Metrics

- Identify metrics

- Select and qualify metric elements

The result of this process is the specification of the quality goals a
system is to meet. The process is described in detail in Volume II of
the guidebooks [BOE-2], in paragraphs 4.1, 4.2, and 4.3. Paragraphs
below describe the way SAIC performed each of these steps in compliance
with the SQM methodology, together with the results obtained.

3.3.2  Select  and  Specify  Quality  Factors 

To specify quality factors, SAIC used the quality goal survey
questionnaire described in the software quality measurement methodology
guidebooks. This survey is designed to afford a means of determining
desirable quality goals as seen by the potential users and acquisition
managers. For the decision aid project, surveys were sent to the
original system developer (PAR) and to the SAIC team members working on
this effort. In addition, each was asked to fill out a response form
indicating their reactions to the validity and conduct of the survey.
The survey questionnaire is presented in Appendix A.

Table 3.3-1 indicates the survey results for each respondent for each
decision aid. The quality goals to be specified by each user were of
the form Excellent, Good, Average, and Not Applicable. These goals
correspond to numeric quality factor scores of 0 (Not Applicable), .7 to
.8 (Average), .8 to .9 (Good), and .9 to 1.0 (Excellent).
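
The correspondence between goal levels and score ranges can be written
down as a small lookup. The sketch below is illustrative only. The
ranges are those stated above, but the function names are hypothetical,
and the rule that a measured score meets a goal when it is at or above
the lower bound of the goal's range is an assumption, since the
guidebook's treatment of boundary values such as .8 is not given here.

```python
# Score ranges for each goal level, as stated in the text.
GOAL_SCORE_RANGES = {
    "Excellent": (0.9, 1.0),
    "Good": (0.8, 0.9),
    "Average": (0.7, 0.8),
    "Not Applicable": (0.0, 0.0),
}


def goal_range(goal: str) -> tuple:
    """Return the (low, high) score range for a goal level."""
    return GOAL_SCORE_RANGES[goal]


def meets_goal(score: float, goal: str) -> bool:
    """Assumed rule: a score meets a goal at or above the range's lower bound."""
    low, _high = goal_range(goal)
    return score >= low
```

For example, under this assumed rule a measured score of 0.85 meets a
goal of Good but not a goal of Excellent.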


TABLE 3.3-1 QUALITY GOAL SURVEY RESULTS

                             ESCMA                        ECOAEA
QUALITY FACTOR      Developer Pierce Hanley   Developer Lincoln Wine Wotrelli

EFFICIENCY              A       G      G          E        A     N/A    A
INTEGRITY              N/A     N/A    N/A        N/A       A      G     E
RELIABILITY             G       E      E          E        E      E     E
SURVIVABILITY          N/A      A     N/A        N/A       G     N/A    A
USABILITY               E       E      E          E        G      E     G
CORRECTNESS             G       E      E          E        E      E     E
MAINTAINABILITY         G       G      E          E        G      G     A
VERIFIABILITY           E       G      G          E        G      E     G
EXPANDABILITY           E       A      E          E        E      A     G
FLEXIBILITY             A       G      G          E        A      G     G
INTEROPERABILITY        A      N/A     A          A        G      A    N/A
PORTABILITY             G       A     N/A         E        G     N/A    G
REUSABILITY             A       A      A          E       N/A     G     A

One problem with this goal specification process is that there is no
relationship or quantification of the quality factors with respect to
mission acceptability or performance. A System Project Officer does not
know if "good" RELIABILITY is good enough, or if "excellent" is
required. Baseline values or experience values would be a valuable
addition to aid the SPO and provide him with a basis for picking a
particular goal level.

Each decision aid is a relatively small system, and was therefore
treated as a single Computer Software Configuration Item (CSCI). No
subsystems were identified in the documentation for either aid, nor were
separate sets of documents developed as they are for multiple CSCIs.

The software quality measurement (SQM) methodology directs that separate
quality goals are to be developed for each system function. When the
user surveys are distributed, however, no information is contained on
the forms that specifies exactly what the identified functions are. As
a result, it is up to each survey respondent to list the functions as he
believes they exist. This can cause great confusion and difficulty.
Section 4.3.9 contains SAIC's recommended solution to this problem.

As  an  example  of  the  confusion  that  can  occur,  the  four  functions 
identified  by  the  developer  (PAR  Technology)  for  ESCMA  are: 

•  Driver 

•  Calculation 

•  Sensitivity  Analysis 

•  Report  Generation 

However, the ESCMA functional description [PAR-4] describes the
functions as:

• Identify areas of operation and aircraft/airfields of
interest

• Update airfield and aircraft resource status

• Establish aircraft sortie and resource consumption rates and
minimum requirements

• Develop objective function and constraint equations for
optimization

• Compute maximum sortie rates

• Develop and document enemy sortie capability estimate

Functional decomposition is in general a subjective process, and
typically results in varying lists of what constitutes the system
functions. These discrepancies certainly were evident in this
methodology step. For this reason, SAIC is recommending a modified
approach to the user questionnaire. This approach is discussed in
Sections 4.3.1 and 4.3.9, and uses functional goal specification only
when functions have been identified and specified before quality goal
specification is to take place. Otherwise, only system-wide goal
setting is used.


SAIC  did  not  use  the  functional  allocation  given  by  the  developer  or  as 
shown  in  the  documentation  because  we  believe  that  quality  results 
calculated  on  that  basis  are  misleading.  The  subjectivity  of  functional 
decomposition  detracts  from  the  quantitative  results  that  the  software 
measurement  technology  is  attempting  to  build  and  validate.  We 
recommend  that  this  additional  subjectivity  be  eliminated  until  methods 
have  been  developed  for  more  objective  functional  decomposition,  and  for 
the  setting  of  various  goals  among  functions  based  on  sound  reasoning 
and  theoretical  analyses. 

To analyze the ESCMA and ECOAEA data, SAIC consolidated all of the
various specified functional goals into single system-wide quality
goals. To accomplish this, the highest functional goal for each factor
was taken as the system-wide goal. As an example, the quality goals for
the factor USABILITY were listed as Good, Excellent, Excellent and
Average for the four developer-identified functions. One approach to
establishing a system-wide goal might be to average these functional
goals (with a result of a goal of Good for USABILITY). We think it is
better to have goals that reflect the highest standard desired, and so
instead have chosen to establish the goal as Excellent. This was chosen
because at this level, the quality can only be as good as its weakest
function.
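
The two consolidation rules discussed above can be compared in a short
sketch. It is illustrative only. The numeric ranks assigned to the goal
levels and the function names are assumptions introduced here, and the
averaging rule simply rounds the mean rank to the nearest level.

```python
# Assumed ordinal ranks for the goal levels (not taken from the guidebooks).
RANK = {"Not Applicable": 0, "Average": 1, "Good": 2, "Excellent": 3}
LEVEL = {rank: name for name, rank in RANK.items()}


def highest_goal(functional_goals):
    """System-wide goal as the highest functional goal (the rule SAIC used)."""
    return max(functional_goals, key=lambda goal: RANK[goal])


def averaged_goal(functional_goals):
    """Alternative rule: mean rank, rounded to the nearest goal level."""
    mean_rank = sum(RANK[g] for g in functional_goals) / len(functional_goals)
    return LEVEL[round(mean_rank)]


# USABILITY goals for the four developer-identified functions.
usability = ["Good", "Excellent", "Excellent", "Average"]
print(highest_goal(usability))   # Excellent
print(averaged_goal(usability))  # Good
```

For the USABILITY example the mean rank is 2.25, which rounds to Good,
while the highest-goal rule yields Excellent, matching the two outcomes
described in the text.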

Table 3.3-1, presented earlier, lists the goals specified by the
developer and by SAIC analysts. SAIC analysts were included in the
survey for two reasons. The first was to give us experience in goal
setting, to better allow assessment of how an acquisition manager or
developer might respond to the survey. The second reason was to provide
more data points for the process of final goal specification. The
acquisition manager is likely to have several survey responses to
consolidate, and we wished to duplicate his experience. The goals set
by the SAIC analysts were derived to the best of their knowledge, but
are not as meaningful as those that an acquisition manager would set
himself. The goals listed in the table reflect the highest goal for any
subsystem for each factor, as described above.

Using developer and SAIC survey responses resulted in correlating three
separate sets of goals for the ESCMA aid, and four sets for the ECOAEA
aid. Analysis was made of the variation in results for each factor for
both aids.

For the ESCMA aid, the SAIC analysts agreed exactly on eight factors out
of the thirteen. For four other factors, the goals were only one rating
apart (between "Not Applicable" and "Average," or between "Good" and
"Excellent," for example). One factor was two ratings apart ("Average"
and "Excellent" for EXPANDABILITY). Including the developer's responses
resulted in less agreement. Only three factors matched exactly, eight
were one rating apart, and two were two ratings apart. These data, as
well as data for the ECOAEA aid, are shown in Table 3.3-2.
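
The ESCMA agreement counts quoted above can be reproduced from the
ESCMA columns of Table 3.3-1. The sketch below is illustrative. The
function names are hypothetical, and the numeric grade values are an
encoding introduced here in which "Not Applicable" sits one rating below
"Average", as the text treats it.

```python
from collections import Counter

# Assumed numeric encoding: N/A is one rating below Average.
GRADE = {"N/A": 0, "A": 1, "G": 2, "E": 3}

# ESCMA goals from Table 3.3-1, ordered (Developer, Pierce, Hanley).
ESCMA = {
    "EFFICIENCY": ("A", "G", "G"),
    "INTEGRITY": ("N/A", "N/A", "N/A"),
    "RELIABILITY": ("G", "E", "E"),
    "SURVIVABILITY": ("N/A", "A", "N/A"),
    "USABILITY": ("E", "E", "E"),
    "CORRECTNESS": ("G", "E", "E"),
    "MAINTAINABILITY": ("G", "G", "E"),
    "VERIFIABILITY": ("E", "G", "G"),
    "EXPANDABILITY": ("E", "A", "E"),
    "FLEXIBILITY": ("A", "G", "G"),
    "INTEROPERABILITY": ("A", "N/A", "A"),
    "PORTABILITY": ("G", "A", "N/A"),
    "REUSABILITY": ("A", "A", "A"),
}


def spread(goals):
    """Number of ratings between the highest and lowest goal given."""
    ranks = [GRADE[g] for g in goals]
    return max(ranks) - min(ranks)


def variation_counts(table, include_developer):
    """Count factors by spread; the developer is the first entry in each row."""
    start = 0 if include_developer else 1
    return Counter(spread(goals[start:]) for goals in table.values())


print(variation_counts(ESCMA, include_developer=False))
print(variation_counts(ESCMA, include_developer=True))
```

With the SAIC analysts alone, the counts are eight factors with no
variation, four one rating apart, and one two ratings apart. Including
the developer gives three, eight, and two, in agreement with the figures
in the text.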


[Table 3.3-2: number of factors showing no variation, or a variation of
1, 2, or 3 letter grades**, for ESCMA (SAIC, ALL*) and ECOAEA (SAIC,
ALL*).]

*  Includes developer

** N/A is treated as one letter grade below average (A)


Analysis indicates that the differences in goal specification shifted
toward greater variation when the goals of the developer were included.
There was greater agreement among SAIC analysts than there was among
SAIC analysts and the project developers. Less variation was also
shown among the analysts for the ESCMA than for the ECOAEA. Since
SAIC's metric-experienced team was assigned to ESCMA, these more
consistent results were expected. These differences raise some
questions about the objectivity of this methodology, particularly in
use of the factors FLEXIBILITY and EFFICIENCY. The goals for these
factors differed by as much as two letter grades.

Since SAIC analysts completed survey forms in order to collect data
and provide multiple results for analysis efforts, the discrepancies
in goal specifications are not meaningful as such. They do serve
to emphasize, however, the subjective nature of the goal survey
process. SAIC is recommending that efforts be made to
reduce this subjective element with further research to validate scores
against observed, visible quality (see Section 4.2.2).

3.3.3 Select and Specify Quality Criteria

This SQM procedure involves selecting the criteria to be used in
measuring the quality factors. The criteria for each applicable factor
are selected and weighted to calculate the desired values.

Since the prototype aids do not access data from external data bases,
the criterion of effectiveness of communication for the factor
EFFICIENCY was considered to be inapplicable and was weighted as zero.
Since neither decision aid communicates with any external systems, the
criterion of system compatibility was also weighted as zero.

The  criterion  of  effectiveness  processing  was  considered  to  have  greater 
impact  on  the  factor  EFFICIENCY  than  effectiveness  storage,  since 
storage  is  off-line  and  does  not  appear  to  be  a  limitation.  The 
elimination  of  effectiveness-communication  raised  the  weighting  of  each 
of  the  two  remaining  criteria  from  33%  to  50%.  Considering 
effectiveness  processing  (EP)  to  be  more  important  than  the  storage  (ES) 
criterion  resulted  in  raising  the  weighting  of  EP  to  80%  of  the  total, 
and  the  lowering  of  ES  to  20%.  These  weightings,  while  reflecting  our 
best  estimation  of  what  was  appropriate,  illustrate  the  arbitrary  nature 
of  the  current  criteria  weighting  methodology. 
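
The effect of the weighting choice can be illustrated with a short
Python sketch. It assumes that a factor score is the weighted sum of its
criterion scores; the function name factor_score and the criterion
scores used here are hypothetical, chosen only to show how the same
measurements give different EFFICIENCY scores under different weights.

```python
def factor_score(criterion_scores, weights):
    """Weighted sum of criterion scores; criteria weighted zero are skipped."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("criterion weights must sum to 1")
    return sum(w * criterion_scores[c] for c, w in weights.items() if w > 0)


# Hypothetical criterion scores; EC has no score because it was eliminated.
scores = {"EP": 0.9, "ES": 0.5}

equal_weights = {"EC": 0.0, "EP": 0.5, "ES": 0.5}   # after eliminating EC
chosen_weights = {"EC": 0.0, "EP": 0.8, "ES": 0.2}  # EP judged more important

equal_result = factor_score(scores, equal_weights)
chosen_result = factor_score(scores, chosen_weights)
print(round(equal_result, 2), round(chosen_result, 2))
```

With these assumed inputs the factor score moves from about 0.70 to
about 0.82 purely as a result of the weighting decision.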

Weighting the criterion system compatibility as zero (since there is
no inter-system communication) altered the weighting of the remaining
criteria which determine the quality factor INTEROPERABILITY,
increasing each from 20% to 25%. All other criteria
weights  remain  unaltered,  since  we  had  no  justification  for  changing 
these  weights.  Table  3.3-3  indicates  the  composition  of  each  of  the 
quality  factors  by  presenting  their  weighting  formulas.  In  the  table, 
some  criteria  weights  are  shown  as  decimal  values,  and  others  as 
fractional  elements.  Decimals  were  used  whenever  possible  because  they 
clearly  represent  the  numerical  values  to  be  calculated.  In  some  cases, 
however,  fractions  do  not  translate  to  finite  decimal  values  (e.g., 
1/3).  For  those  cases,  the  fractions  themselves  are  given  in  the  table. 
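
The reweighting arithmetic can be sketched with exact fractions, which
avoids the problem of values such as 1/3 having no finite decimal form.
This is an illustrative sketch only: the function name renormalize is
hypothetical, and the INTEROPERABILITY criterion labels K1 through K4
are placeholders for the four criteria that remain once system
compatibility (SY) is weighted as zero.

```python
from fractions import Fraction


def renormalize(weights, eliminated):
    """Zero the eliminated criteria and rescale the rest to sum to 1."""
    kept = {c: w for c, w in weights.items() if c not in eliminated}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("at least one criterion must remain")
    result = {c: w / total for c, w in kept.items()}
    result.update({c: Fraction(0) for c in eliminated})
    return result


# EFFICIENCY: three equally weighted criteria, EC eliminated.
efficiency = {c: Fraction(1, 3) for c in ("EC", "EP", "ES")}
# INTEROPERABILITY: five equally weighted criteria, SY eliminated.
# K1..K4 are placeholder labels, not names from the guidebook.
interoperability = {c: Fraction(1, 5) for c in ("SY", "K1", "K2", "K3", "K4")}

eff = renormalize(efficiency, {"EC"})
inter = renormalize(interoperability, {"SY"})
print(eff["EP"], inter["K1"])
```

The subsequent shift of EP to 80% and ES to 20% is a judgment applied
after this step and is not produced by the renormalization itself.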

Only  one  particular  problem  was  noted  during  criteria  specification. 
The  methods  in  the  guidebook  for  selecting  and  weighting  criteria  are 
arbitrary,  and  without  detailed  justification.  The  acquisition  manager 
has  no  way  to  link  any  assigned  weightings  to  any  real-world  indicators 
or  values.  Since  the  scores  are  so  dependent  on  this  arbitrary 
assignment,  there  is  no  way  for  the  manager  to  know  that  any  calculated 
result reflects a real-world meaning (such as errors per thousand lines
of code).

SAIC  is  recommending  methods  to  help  in  the  correlation  of  calculated 
quality  scores  and  the  actual  quality  of  each  system.  One  method, 
described  in  section  4.3.8,  is  to  eliminate  adjustment  of  criteria 
weighting.  Instead,  procedures  would  be  devised  to  allow  corrective 
efforts  to  focus  on  criteria  of  special  concern.  Section  2  also 
contains  information  on  this  problem,  presenting  data  concerning  the 
validity  of  the  calculated  decision  aid  quality  scores. 

3.3.4 Select and Specify Quality Metrics

Following the elimination of non-applicable quality factors and criteria
(and the reweighting of criteria), the non-applicable individual metric
elements were eliminated. The similarities between ESCMA and ECOAEA
meant that the same weighting formulas and metric element questions
could be applied to both aids. Table 3.3-4 shows the questions which
were eliminated from each worksheet using this process.


TABLE 3.3-4 [Questions eliminated from Worksheets 1 through 4. The
assignment of entries to worksheet columns is not recoverable from the
source; entries are listed in source order.]

AM.1(1), AM.6(1)-(4), AT.3(1)-(2), AM.2(2), AM.6(1), AM.7(1)-(3),
AM.2(4), AM.7(1)-(3), AU.1(2), AT.3(2), AT.3(1)-(2), AT.3(2),
CL.1(7)-(8), CL.1(2)-(8), CL.2(1), CL.1(7)-(8), AU.2(1), CL.1(11),
CL.2(1), CL.2(2), CP.1(11), CL.1(1)-(12), CL.2(4), CP.1(11), CL.2(1),
CL.2(6), ES.1(4), CL.2(3)-(8), CS.1(2)-(4), CL.3(1), CP.1(11),
OP.1(10), CS.2(1)-(3), CP.1(11), CS.2(4), EP.1(5), CS.2(5),
CS.2(4)-(5), ES.1(4), DI.1(4), ES.1(7), DI.1(2), DI.1(6)-(9), DI.1(4),
OP.1(1), DI.1(6)-(9), EP.1(5), OP.1(2), EP.2(3), EC.1(1), SD.2(1),
MO.1(9), SD.2(2), FO.1(1)-(4), MO.2(3), SD.2(4), MO.2(5), SD.2(5),
FS.2(2), SD.3(5), FS.2(6), OP.1(4), OP.2(6), SI.4(13), ID.1(2)-(3),
SI.4(14)
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RE.1(1), RE.1(3)
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SD.3(5), SS.1(4), SY(ALL)
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In  a  standard  acquisition  process,  the  elimination  of  metric  questions 
would  mean  that  they  were,  of  course,  not  scored.  Because  of  the 
research  nature  of  this  effort,  however,  all  questions  were  evaluated 
and  scored.  This  resulted  in  two  scores  for  quality  assessment:  one 
reflecting  only  those  elements  that  were  not  eliminated,  and  one 
reflecting  all  question  answers. 

Because  individual  metric  questions  are  not  weighted,  and  are  eliminated 
only  on  the  basis  of  criteria  and  factor  elimination,  we  did  not  uncover 
a  problem  with  the  arbitrary  nature  of  scoring  or  removing  questions. 
We  did  have  difficulty,  however,  with  the  weight  that  each  question 
receives.  Section  4.2.2  discusses  this  as  an  area  for  continued 
research. An example of the problems in this area is in the evaluation
of the aid ECOAEA during the code phase. The factor VISIBILITY received
a score of .98 for that phase. That entire score was based on answering
one question, OP.1(10). All other questions were not applicable (there
are three operability questions in that phase).
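
The difficulty can be made concrete with a small Python sketch. It
assumes that a score is the simple average of the applicable answers,
with not-applicable answers dropped; the function name
score_with_coverage and the question labels OP.x and OP.y are
hypothetical. Reporting the number of contributing answers next to the
score makes a single-question result visible.

```python
def score_with_coverage(answers):
    """Average the applicable answers and report how many there were.

    answers maps a question ID to a score in [0, 1], or None for N/A.
    """
    applicable = [a for a in answers.values() if a is not None]
    if not applicable:
        return None, 0
    return sum(applicable) / len(applicable), len(applicable)


# One applicable question out of three; OP.x and OP.y are placeholders.
answers = {"OP.1(10)": 0.98, "OP.x": None, "OP.y": None}
score, count = score_with_coverage(answers)
print(score, "based on", count, "of", len(answers), "questions")
```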
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choices that provided the most effective means of reaching his
desired quality goal. SAIC presents a recommended solution for this
problem in Section 4.i.4.
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Evaluation  Report  and  submitted  to  the  System  Program  Office. 

The  steps  in  the  procedure  for  scoring  quality  aspects  are  to  first 
identify  allocation  relationships,  and  then  to  apply  the  worksheets. 
SAIC's  approach  to  this  process  was  to  follow  the  steps  identified  in 
the  SQM  guidebooks  as  closely  as  possible,  while  keeping  detailed 
records  on  results,  time  spent,  and  material  covered. 

The  allocation  identification  process  described  in  the  guidebooks  did 
not  lend  itself  well  to  this  application.  Discussion  of  this  process  is 
listed  below  in  Section  3.4.1. 

Worksheet  application  was  not  described  in  detail  in  the  methodology. 
SAIC performed this task by creating one team to score each decision
aid. The team scoring the ECOAEA aid primarily consisted of analysts
who were inexperienced in metrics. The team scoring the ESCMA aid
primarily consisted of analysts experienced in metrics. The following
assumptions and techniques were used to score the worksheets:

•  All  metric  elements  on  each  worksheet  were  scored.  In  a 
standard  application  of  the  methodology,  only  those  metric 
elements  which  are  determined  to  be  applicable  during  quality 
goal  specification  would  be  scored.  Because  of  the  research 
nature of this project, however, every question was to be
answered  and  associated  data  collected. 

•  Time  spent  for  each  metric  element  was  tracked  on  the  project 
time  log  sheets  shown  earlier.  Time  spent,  amount  of 
material  covered,  and  the  identification  numbers  of  Technical 
Problem  Reports  and  Methodology  Problem  Reports  created  were 
recorded. 

•  Each analyst was assigned an arbitrary section of the
worksheet to complete for Worksheets 0, 1, 2, and 3A & 3B.
For Worksheet 4A & 4B, questions were allocated based on
analyst experience. Some questions are difficult to answer
for non-programmers. Section 4.4.1 discusses this further.

•  The  metric  element  questions  were  completed  in  order  as  they 
appeared  on  the  worksheet. 

•  If  the  score  on  a  metric  element  (for  Worksheets  0,  1,  and  2) 
was  neither  "YES"  nor  "1",  a  Technical  Problem  Report  was 
completed.  This  report  describes  a  metric  violation.  For 
Worksheets  3B  and  4B,  no  problem  reports  were  to  be  completed 
—  instead,  we  had  planned  to  produce  the  reports  based  on 
the  scoring  onto  Worksheets  3A  and  4A.  Time  did  not  permit 
the  creation  of  problem  reports  on  those  worksheets.  Since, 
however,  most  questions  are  covered  in  the  early  phases  of 
development,  this  did  not  create  a  problem  for  analysis. 
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•  If  any  questions  or  difficulties  arose  during  metric  element 
evaluation,  a  Methodology  Problem  Report  was  to  be  completed. 

These  steps  and  assumptions  were  used  to  evaluate  all  the  worksheets. 
Since  each  decision  aid  consisted  of  a  single  Computer  Software 
Configuration  Item  (CSCI),  the  differentiation  between  the  system-level 
Worksheet  0  and  the  CSCI-level  Worksheet  1  was  not  large.  Because  of 
this,  no  new  information  would  be  gained  by  evaluating  both  Worksheets  0 
and  1  for  each  aid.  For  research  purposes,  we  decided  to  evaluate 
Worksheet  0  at  the  system  level  for  all  of  the  decision  aids  developed. 
We used the decision aid statement of work (SOW) and the planning
document [PAR-2] to make this evaluation. Worksheets 1 through 4A & 4B
were evaluated for both aids. Table 3.4-1 lists the documents used for
the worksheets evaluated for each aid.

These  documents  were  selected  as  the  best  fit  available  for  the  intent 
of the evaluation process. This evaluation was done outside of the
system development process and after test project completion. This
means that all results were gathered after the full implementation of
the  aids,  and  therefore  could  not  influence  the  development  process. 

It was PAR's intention to develop the decision aids largely in
accordance with MIL-STD-7935.1-S, Automated Data Systems Documentation
Standards [7935], but in fact the documents and life cycle do not
entirely follow this standard. This was due to the prototype nature of
the decision aid project.

Even  though  the  project  did  not  entirely  correspond  to  the  standards  of 
MIL-STD-7935.1-S, it is important to include the standard in order to
understand the developer's intent. Documents required for MIL-STD-7935
development  can  be  assigned  to  two  basic  categories:  system  documents 
and  collateral  documents.  System  documents  are  those  engineering 
documents  used  to  define,  build,  and  maintain  the  system.  Collateral 
documents include those that manage and control the development process,
that provide standards, that describe how to use the system, and that report on
testing.  The  reason  for  this  decomposition  is  to  provide  a  simplified 
frame-of-reference  for  comparison  to  another  military  standard, 
DOD-STD-2167.  Documents  in  each  category  for  MIL-STD-7935.1-S  are 
listed  below: 

COLLATERAL

•  User's Manual
•  Computer Operation Manual
•  Program Maintenance Manual
•  Test Plan
•  Test Analysis Report

SYSTEM

•  Functional Description
•  Data Requirements Document
•  System/Subsystem Specification
•  Program Specification
•  Data Base Specification

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TABLE 3.4-1 DOCUMENTS EVALUATED

WORKSHEET   AID      DOCUMENT

0           SYSTEM   INTERIM TECHNICAL REPORT, SENIOR BATTLE STAFF
                     DECISION AIDS, TASK 1: PLANNING (3-84) [PAR-2]

                     STATEMENT OF WORK FOR SENIOR BATTLE STAFF
                     DECISION AID (12/82) [PAR-3]

1           ECOAEA   ENEMY COURSE OF ACTION EVALUATION AID
                     FUNCTIONAL DESCRIPTION AND DESIGN PLAN,
                     SENIOR BATTLE STAFF DECISION AIDS (7-84) [PAR-5]

1           ESCMA    ENEMY SORTIE CAPABILITY MEASUREMENT AID:
                     DESIGN PLAN AND FUNCTIONAL DESCRIPTION (8/85)
                     [PAR-4]

2, 3        ECOAEA   ENEMY COURSE OF ACTION EVALUATION AID FINAL
                     FUNCTIONAL DESCRIPTION AND DESIGN PLAN
                     (10/85) [PAR-6]

2, 3        ESCMA    ENEMY SORTIE CAPABILITY MEASUREMENT AID:
                     DESIGN PLAN AND FUNCTIONAL DESCRIPTION
                     (8/85) [PAR-4]

4           ECOAEA   SOURCE CODE LISTINGS

4           ESCMA    SOURCE CODE LISTINGS


A draft copy of the DOD-STD-2167 standard, Defense System Software
Development, was the basis for the guidebook's software quality
measurement technology. In order to make comparison to MIL-STD-7935.1-S
easier, the documents described in that standard have also been divided
into two types, system and collateral:

COLLATERAL

•  Operational Concept Document
•  Software Configuration Management Plan
•  Software Quality Evaluation Plan
•  Software Development Plan
•  Software Test Description
•  Software Test Plan
•  Computer Resources Integrated Support Document
•  Computer System Operator's Manual
•  Software User's Manual
•  Computer System Diagnostic Manual
•  Software Programmer's Manual
•  Software Test Report
•  Firmware Support Manual
•  Version Description Document
•  Software Test Procedure

SYSTEM

•  System/Segment Specification
•  Interface Requirements Specification
•  Software Requirements Specification
•  Software Top Level Design Document
•  Interface Design Document
•  Data Base Design Document
•  Software Detailed Design Document
•  Source Code (Software Product Specification)


Comparison between the documents required for each standard yielded a
correspondence as shown in Figure 3.4-2. Given this framework, it was
necessary to analyze the existing and delivered system documentation
against both MIL-STD-7935.1-S and DOD-STD-2167. This was required to
understand  the  documents  the  guidebooks  were  designed  to  use,  as  well  as 
the  documents  actually  to  be  used  for  evaluation.  Tables  3.4-2  and 
3.4-3  present  the  documents  available  for  analysis  for  each  decision 
aid.  Figure  3.4-3  is  a  synthesis  of  the  preceding  figures  and  tables 
and  represents  the  documents  available  and  their  relation  to  the 
desirable  set  of  specifications. 

A  later  version  of  the  functional  description  was  used  for  scoring 
Worksheets  2  through  4  for  both  decision  aids.  These  specifications 
became  available  after  the  scoring  of  Worksheet  1  was  completed. 

Analysis  of  the  methodology  problems  uncovered  during  the  scoring  of 
these  worksheets  is  contained  in  the  following  sections. 


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3.4.1  Identify  Allocation  Relationships 

The ESCMA and ECOAEA decision aids are each relatively small systems,
and were not decomposed into Computer Software Configuration Items
(CSCIs) or Computer Software Components (CSCs) during the development
process. As discussed in paragraph 3.3.2, there were also discrepancies
in the functions specified in the documentation of each system and as
listed by the developers in the user survey questionnaire. These
discrepancies  illustrate  the  subjectivity  inherent  in  functional 
decomposition,  and  the  variations  that  exist  among  the  resulting 
functional  lists.  For  these  reasons,  SAIC  did  not  allocate  separate 
quality  requirements  among  a  list  of  system  functions  for  either  aid. 
Instead,  both  aids  were  given  system-wide  goals  based  on  developer  and 
SAIC  team  responses  to  the  user  survey  questionnaire. 

Because  of  that  selection,  it  was  not  necessary  to  identify  the 
relationship  between  the  specified  functions  and  the  system 
documentation  as  developed.  We  still  examined  the  methodology,  however, 
in  order  to  suggest  improvements  and  assess  its  feasibility. 

We believe that the identification process has difficulties. Because
functional decomposition is subjective and arbitrary, the allocation of
requirements to CSCIs based on these functions is also arbitrary. The
assessment  of  the  amount  of  a  particular  CSCI  that  reflects  a  particular 
function  is  also  difficult  and  subjective.  For  these  reasons,  we  are 
recommending  that,  in  general,  no  functional  decomposition  and 
allocation  of  requirements  be  contained  in  the  current  guidebooks.  As 
research  proceeds  and  these  types  of  relationships  are  verified,  the 
procedures  should  be  placed  back  in  the  guidebooks  with  full 
justification  for  their  existence  and  with  definite  steps  and  guidance 
as  to  how  to  perform  this  task. 

For  systems  that  do  merit  functional  decomposition  (as  described  in 
paragraph  4.3.5),  we  recommend  that  CSCIs  be  allocated  to  functions 
(rather  than  the  reverse).  This  means  that  rather  than  trying  to 
determine  how  much  of  a  particular  CSCI  (e.g.,  50%)  relates  to  a 
particular  function,  one  would  determine  whether  or  not  a  CSCI  aids  in 
implementing  a  function  at  all.  If  it  does,  then  the  CSCI  would  be 
counted 100% in the score created for that function. As an example,
consider calculating optimal values as a function. For
this  function,  any  CSCI  that  performed  any  part  of  the  task  would  be 
considered  in  the  scoring.  One  would  not  take  into  account  that  only 
10%  of  a  given  CSCI  actually  contributed  to  that  particular  function. 
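
The recommended all-or-none allocation can be sketched as follows. This
is an illustration only: the function name function_score, the CSCI
names, the scores, and the use of a simple average to combine the
contributing CSCIs are all assumptions made for the sketch, not
procedures taken from the guidebooks.

```python
def function_score(csci_scores, contributes):
    """Average the scores of every CSCI that helps implement the function.

    contributes maps a CSCI name to True or False. A CSCI that
    contributes at all is counted in full, however small its share.
    """
    members = [csci_scores[name] for name, used in contributes.items() if used]
    if not members:
        return None
    return sum(members) / len(members)


# Hypothetical CSCI quality scores.
csci_scores = {"CSCI-A": 0.9, "CSCI-B": 0.6, "CSCI-C": 0.3}

# CSCI-B is counted in full even if only a small part of it takes part
# in the function; CSCI-C takes no part and is left out.
contributes = {"CSCI-A": True, "CSCI-B": True, "CSCI-C": False}

optimal_values_score = function_score(csci_scores, contributes)
print(optimal_values_score)
```

The analyst then only has to answer a yes/no question for each CSCI,
rather than estimate a percentage.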

Section  4.3.5  contains  other  recommendations  applicable  to  identifying 
the  allocation  of  functional  requirements  among  the  CSCIs. 

3.4.2 Application of Worksheets

A major aspect of the analysis of the SQM methodology is evaluating the
metric element questions that comprise the worksheets. This analysis
has been done for all worksheets.

Several general statements can be made about the organization of the
worksheets. These statements are explained in detail below, along with
recommendations for correcting deficiencies.


Each worksheet is organized with the questions in alphabetical order by
metric mnemonic, but the data needed to answer the [...]
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undocumented (for a score of 0).

3.4.2.1 Metric Comment Categories

The worksheet evaluation task involved answering 327 metric elements,
with 850 questions appearing on the 5 worksheets. Of the metric
elements, 153 (47%) received problem comments of one form or another.
Of the questions comprising these elements, [...] received problem
comments.

The types of problems encountered during worksheet evaluation can be
grouped into nine categories, with many of the metrics commented upon
for more than one problem. These categories are listed in Table 3.4-4.
The following paragraphs describe each of the categories and provide
examples of each problem.
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CATEGORY   PROBLEM

1          Question is inappropriate for the worksheet level.

2          Question is confusing.

3          Question generated a miscellaneous comment (e.g.,
           decomposition needed, question dependent on previous
           worksheet).

4          Question is too subjective.

5          Question requires an "all or nothing" response.

6          Question contains a typographical error, or some other
           format problem.

7          Question is a duplicate of another question on same
           worksheet.

8          Calculating the question's score involves dividing by
           zero, or there is some other scoring difficulty.

9          Question should be in a block of questions that are
           nested by topic.


Category 1: The question is inappropriate for this worksheet level.
Some questions were found by the analysts to be inappropriate for the
phase or level under analysis. For example, one would not expect to see
data flow diagrams and discussion of control flow in high-level
requirements documents. The example question below, however,
specifically calls for this information. The Data Item Descriptions for
DOD-STD-2167 do not call for this information at the requirements level.
The System/Segment Specification (DI-CMAN-80008), for example, does not
contain a requirement for this type of data. It does require, however,
that functional flow be shown. [...] flow as opposed to control flow
[...] requirements documents. [...] recommends removing them from [...]
this type of question [...]
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ap  ' '.'c .-’-trncpu  Iti't  r’'- -  ''■ttestio’ 
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Foi  '='1:  .t.  of  the.  ■■■'  '■>  s*',:-'  .--e  t  ‘.c  t’'-  ■  ''■t'esti','’ 

eliminated  from  the  . -'.aDprop.- '.are  ■■■•  .v'rs  . 

Category 2: The question is confusing. This category was cited for
more metric elements than any of the others. [...] Some questions
contained terms that were either poorly defined in the glossary, or not
defined at all. Other questions [...] examples would be necessary to
explain the concept. Some of the [...] could be blamed on the
glossaries, which were sparse and identical for each worksheet. [...]
An example of this type of question, using such terms as "specific data
storage and retrieval references," is:
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Worksheet 3B, Question AP.2(1). Is the logical processing free
from specific data storage and retrieval references (e.g., data
symbolically defined and referenced)?

SAIC's recommended solution for this problem is found in paragraphs
4.4.6, 4.4.7, 4.4.8, and 4.2.3. We recommend that training be supplied
as an aid to standard worksheet evaluation, and that a glossary, metric
examples, and standard procedures be developed.

Category 3: The question generated a miscellaneous comment. This
category includes questions that could be decomposed, that were
dependent on answers supplied in a previous worksheet, that should be
moved from a unit-level worksheet to a CSCI-level worksheet, and that
contained content with which we disagreed. Compound questions ask one
question about several different things and become difficult to answer
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when  not  all  the  answers  are  the  same.  Complex  questions  force  you  to 
answer  several  questions  before  you  have  enough  information  with  which 
to  answer  the  "real"  question.  An  example  of  a  compound  question  is: 

Worksheet 1, Question TN.1(1). Are there requirements to
provide  lesson  plans  and  training  materials  for  operators,  end 
users,  and  maintainers  of  the  CSCI? 

Some  questions  are  asked  for  each  unit,  but  we  recommend  that  they  only 
be  asked  once  for  the  complete  CSCI.  An  example  of  this  is  question 
AM.2(7)  for  Worksheet  3B.  This  question  implies  that  each  unit  should 
check  for  all  data.  We  recommend  moving  the  question  to  Worksheet  3A, 
and  changing  it  to  read:  "Is  a  check  performed  to  determine  that  the 
data  used  by  all  units  is  available  before  it  is  used  in  processing?" 
The  original  Worksheet  3B  question  reads: 

Worksheet 3B, Question AM.2(7). Is a check performed before
processing  begins  to  determine  that  all  data  is  available? 

Since  we  required  analysts  to  answer  all  questions  (even  if  not 
applicable  to  the  decision  aids),  they  noted  several  questions  which 
were  dependent  on  questions  answered  in  prior  worksheets.  The 
consistency  questions  are  an  excellent  example  of  this  problem.  Early 
questions  concern  the  existence  of  standards,  while  later  worksheet 
questions  relate  to  complying  with  these  standards.  We  believe  it  would 
greatly  help  the  worksheet  tailoring  process  if  lists  of  these  related 
questions  were  made  available  to  ensure  that  they  are  consistently 
tailored  out.  An  example  of  this  relationship  is: 

Worksheet 1, Question CS.1(1). Have specific standards been
established for design representations (e.g., HIPO charts, program
design language, flow charts, data flow diagrams)?

Worksheet 2, Question CS.1(1). Are the design representations in
the formats of the established standard?
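
A list of related questions of the kind suggested above could be
applied mechanically during tailoring. The following Python sketch is
only an illustration: the RELATED table holds the single pair quoted
above, and the function name tailor_out is hypothetical. A full list
of related questions would have to be compiled from the worksheets.

```python
# Each key is a (worksheet, question) pair; its value lists the later
# questions that depend on it. Only the pair quoted in the text is
# shown; a complete list would have to be compiled from the worksheets.
RELATED = {
    ("1", "CS.1(1)"): [("2", "CS.1(1)")],
}


def tailor_out(initial, related):
    """Return the initial questions plus everything that depends on them."""
    removed = set()
    pending = list(initial)
    while pending:
        question = pending.pop()
        if question in removed:
            continue
        removed.add(question)
        pending.extend(related.get(question, []))
    return removed


removed = tailor_out([("1", "CS.1(1)")], RELATED)
print(sorted(removed))
```

Tailoring out the Worksheet 1 question then removes the dependent
Worksheet 2 question as well, so the two are always treated alike.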

Some  questions  contained  content  with  which  we  disagreed.  An  example  of 
this  is  question  AU.1(2)  on  Worksheet  3A.  This  question  requires 
estimating  source  code  lines  at  the  design  level.  We  believe  this 
question  can  give  misleading  results,  as  the  estimation  of  lines  of 
source  code  is  very  subjective.  The  question  reads: 

Worksheet 3A, Question AU.1(2). How many estimated lines of
source  code,  excluding  comments?  ... 

Category 4: The question is too subjective. These questions contain
qualitative adverbs like "minimally", "typically", "completely",
"clearly", and "precisely". The answers to these questions depend
totally on the analyst's subjectivity, and may not be an accurate
reflection of the software being evaluated. The subjective quality of
these questions also limits the value of comparing scores for products
that were measured by different analysts. An example of this type of
question is:

Worksheet 2, Question AU.1(1). Are all processes and functions
partitioned  to  be  logically  complete  and  self  contained  so  as  to 
minimize  interface  complexity? 

By supplying guidance and procedures for these types of question, some
of the subjectivity can be removed. In addition, we recommend that
certain questions be answered by more experienced analysts; these
subjective questions fall into this category. These recommendations are
discussed in paragraphs 4.4.1, 4.4.6, 4.4.7, and 4.4.8. Such questions
as these are difficult to automate, unlike more countable elements like
lines of source code. It is not the goal of this process, however, to
eliminate experienced and competent experts from the evaluation process.
Rather, the focus is to automate questions that do not require a great
deal of judgment, and supply analysts with the information needed to
answer the remaining questions in an effective and efficient manner.

Category 5: The question requires an "all or nothing" response. This
category consists of questions that have the word "all" in them. Such a
question is both difficult to answer and potentially misleading. The use
of "all" leaves the scope of the question up to the particular analyst
answering the question. For example, if the question concerns "all
hardware errors", can the analyst be satisfied with answering the
question regarding all the hardware errors mentioned in the
documentation, or must the analyst consider all the hardware errors that
could possibly occur? The answers to these questions may not accurately
reflect the true state of the product/document, because the instance
where only 1 out of 1000 fails is scored the same way as the instance
where 999 out of 1000 fail. An example of this kind of question is:

Worksheet 1, Question CP.1(1). Are all inputs, processing, and
outputs clearly and precisely defined?

In general, SAIC recommends that these questions be retained as written
for  now.  This  is  discussed  in  paragraph  4.4.10.  Procedures  can  also 
help  solve  this  problem,  as  shown  in  paragraph  4.4.7.  The  data 
collection  workbook  described  in  paragraph  4.4.1  could  also  be  used  as 
a  partial  solution  to  this  difficulty.  The  workbook  could  be  used  to 
record  the  analyst's  assessment  of  the  severity  of  an  "all  or  nothing" 
failure.  The  analyst  would  need  to  be  fully  qualified  to  make  such 
judgments  (i.e.,  of  a  certain  experience  level),  and  this  is  also 
discussed  in  paragraph  4.4.1. 
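
The loss of information in an "all or nothing" score can be shown with
a short Python sketch. The function names are hypothetical, and the
proportional measure is included only as an illustration of the kind of
severity information an analyst might record in the workbook; it is not
a scoring rule taken from the guidebooks.

```python
def all_or_nothing(passed, total):
    """Score 1 only when every item passes, otherwise 0."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 1 if passed == total else 0


def proportion_passed(passed, total):
    """Fraction of items that pass; an illustrative severity measure."""
    if total <= 0:
        raise ValueError("total must be positive")
    return passed / total


# One failure in 1000 and 999 failures in 1000 receive the same score.
nearly_perfect = all_or_nothing(999, 1000)
nearly_failed = all_or_nothing(1, 1000)
print(nearly_perfect, nearly_failed)
print(proportion_passed(999, 1000), proportion_passed(1, 1000))
```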

Category 6: The question contains a typographical error, or some other
format problem. Some questions contain typographical errors, or have a
problem with the format in which they are presented. While this
category does not cause major problems in general, it is an area that
needs correction. As engineers at PAR Technology noted, the software
quality measurement process should itself be of high quality. These
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sorts  of  minor  errors  may  cause  the  software  community  to  perceive 
software  quality  measurement  in  an  unfavorable  light.  An  example  of 
this  sort  of  problem  is: 

Worksheet  3A,  Question  SI.6(1),  Part  d.  How  many  unique  operations? 


The  word  "operations"  should  read  "operators."  This  and  similar 
problems  are  discussed  with  recommended  solutions  in  paragraph  4.4.3. 

Category  7:  The  question  is  a  duplicate  of  another  question  on  the  same 
worksheet.  Some  questions  appear  on  the  worksheet,  with  applicability 
to different criteria, more than once. This is not an error and does
not imply that there is something wrong with the methodology, but it is
an extremely inefficient data collection technique. Examples of this
are shown by:

Worksheet 1, ... Are there requirements for a programming

Paragraph 4.4.7 contains SAIC's proposed solution to this problem,
which is to provide guidance for the analyst as to how to score a
question  in  these  situations. 


Category 9: The question should be in a block that is nested by topic.
For  these  questions,  an  answer  to  an  earlier  question  can  eliminate  the 
need to answer subsequent questions. In the example below, a "no" or
"N/A" answer to question CL.1(2) means that the subsequent questions
should  also  receive  a  "no"  or  "N/A"  response.  There  should  be  no  need 
to  spend  extra  time  re-evaluating  the  subsequent  questions. 

Worksheet 1, Question CL.1(2). Is there a requirement for a
protocol standard to control all network communications?

Question CL.1(3). Is the network processing control part of the
network protocol standard?

Question CL.1(4). Is user session control part of the network
protocol standard?

Question CL.1(5). Is communication routing part of the network
protocol standard?

Question CL.1(6). Is uniform message handling (e.g.,
synchronization, message decoding) part of the network protocol
standard?


The  use  of  a  formal  data  collection  workbook,  described  in  paragraph 
4.4.1,  is  SAIC's  recommended  solution  to  this  type  of  problem. 
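
The nesting idea in Category 9 can be sketched as follows. This is an
illustration under assumptions, not a procedure from the guidebooks: the
function name, the answer strings, and the stand-in analyst are all
hypothetical.

```python
from typing import Callable, Dict, List


def score_nested_block(gate: str, dependents: List[str],
                       answer: Callable[[str], str]) -> Dict[str, str]:
    """Answer the gate question first; evaluate dependents only on "yes"."""
    gate_answer = answer(gate)
    results = {gate: gate_answer}
    for question in dependents:
        if gate_answer == "yes":
            results[question] = answer(question)
        else:
            # A "no" or "N/A" on the gate carries down without re-evaluation.
            results[question] = gate_answer
    return results


# Hypothetical analyst: no protocol standard is required for this system.
asked = []


def analyst(question: str) -> str:
    asked.append(question)
    return "no"


block = score_nested_block(
    "CL.1(2)", ["CL.1(3)", "CL.1(4)", "CL.1(5)", "CL.1(6)"], analyst)
print(block)
print("questions actually evaluated:", asked)
```

Only the gate question is evaluated in this run; the four dependent
questions inherit its answer, which is the time saving the category
describes.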

3.4.2.2 Distribution of Comments

This  section  deals  with  the  number  of  metric  elements  reported  as  having 
problems  and  the  relationship  to  the  number  of  elements  comprising  the 
criteria  and  factors.  This  information  is  calculated  separately  for 
each  worksheet,  but  not  for  each  aid.  For  the  purposes  of  evaluating 
the  methodology,  the  aid  being  analyzed  when  a  metric  problem  was 
encountered  is  not  meaningful. 

There  are  a  total  of  73  metrics  that  have  their  metric  element  questions 
asked  throughout  the  5  worksheets.  These  metrics  are  composed  of  a  total 
of  327  metric  elements,  with  850  questions  asked  in  all  5  worksheets. 
Of  these  850  questions,  317  (37%)  were  reported  as  having  problems  by 
SAIC  analysts.  Of  the  327  metric  elements,  153  (or  47%)  were  reported 
as  having  one  or  more  problems. 
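
As a quick check, the two percentages follow directly from the counts
given above. The variable names are illustrative only.

```python
questions_total = 850
questions_reported = 317
elements_total = 327
elements_reported = 153

question_rate = questions_reported / questions_total
element_rate = elements_reported / elements_total

# Rounded to whole percentages, as reported in the text.
print(f"questions reported: {question_rate:.0%}")
print(f"metric elements reported: {element_rate:.0%}")
```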

These  problems  were  grouped  into  the  nine  categories  discussed  above. 
Figure  3.4-3a  shows  each  of  these  nine  categories.  The  figure  also 
reflects the relative number of metric elements reported for each of the
categories. As an example, 19% of the metric element comments fall
into the category of "All or Nothing" comments (see Category 5 above).
Not  only  were  varying  numbers  of  metric  elements  reported  in  each 
category,  but  the  categories  themselves  are  of  unequal  criticality. 

Some  of  the  categories  are  in  the  nature  of  formatting  problems.  This 
means  that  they  concern  typographical  errors,  duplication  of  questions, 
questions which could be nested, and other minor problems. These
problems can be easily solved by reformatting the worksheets. This
includes Categories 6, 7, and 9 (errors, duplication, and nesting
level), and represents 17% of the problems uncovered.

Other  questions  are  more  of  a  procedural  nature.  Unscorable  questions, 
for example, can be handled by instituting procedures to guide analysts.
This  includes  Categories  4  and  8  (subjective  and  unscorable),  and 
represents  31%  of  the  problems  uncovered. 

The final, and we believe most important, grouping of categories is more
of a content problem. These Categories are 1, 2, and 5 (inappropriate
level, confusing, and all or nothing). These problems require changes
in the questions themselves, and comprise 50% of the problems uncovered.
(The remaining 2% of the problems identified, in the miscellaneous
category, were not allocated to any of these three groupings.)

The  difficulty  in  correcting  these  groupings  of  categories  will  vary. 
Formatting  problems  are  relatively  easy  to  correct,  and  will  not  likely 
generate  much  controversy  or  discussion.  Procedural  problems  are 
somewhat  easily  corrected,  and  can  probably  be  accomplished  without  a 
great  deal  of  discussion.  The  content  problems,  however,  will  be  harder 
to  correct  without  a  great  deal  of  discussion  and  information 
interchange  among  the  Government  agencies  and  contractors  currently 
active  in  the  metric  community.  Future  contracts  and  technical 
interchange  will  help  to  solve  this  problem. 

The metric elements combine to form a total of 29 criteria. Several of
the criteria had no problems reported on any worksheet. These
criteria are distributedness, effectiveness-communication,
effectiveness-storage, generality, system accessibility, system
compatibility, traceability, and visibility. The remaining 22 criteria
were reported for between 15% and 100% of their metric element
questions. The criteria accuracy, functional overlap, and virtuality had
the highest percentage of metric element questions reported, at 100%.
Figure 3.4-4 displays this data for all criteria.

Figure  3.4-5  illustrates  the  percentage  of  the  comments  on  each 
criterion,  relative  to  the  worksheet  involved.  For  each  criterion,  the 
figure  shows  the  percentage  of  metric  elements  reported  out  of  the  total 
present  on  each  of  the  five  worksheets. 

The  criteria  are  combined  to  form  thirteen  factors.  Only  the  factor 
INTEGRITY did not have any metric element questions reported. The
highest percentage of metric elements reported was 80% for the factor
RELIABILITY.  This  was  partly  due  to  the  high  number  of  questions 
reported  for  the  criterion  anomaly  management.  Figure  3.4-6  displays 
this  data  for  all  of  the  factors. 

3.5  Metric  Scoring 

Based  on  the  results  of  the  worksheet  application  for  the  decision  aids 
and phases, scores were calculated for each aid for the requirements,
preliminary design, detailed design, and coding phases. Scores were
calculated in two fashions: one method counted every question, every
metric element, every criterion, and every quality factor. The other
method counted elements as specified in the SQM methodology. Both of
these methods were used because of the research nature of this
evaluation project. All questions were counted in order to gather data
about every metric element, and the specified questions were counted in
order to follow the guidebook methodology as closely as possible.

Section 2.3 contains the results of this scoring process. As was
expected (because of the quality of the available documentation), scores
did  not  achieve  their  desired  levels.  A  discussion  of  this  problem  is 
included  in  Section  2. 

During  the  scoring  process,  only  one  new  problem  was  identified.  During 
use  of  the  scoresheet  for  the  EFFICIENCY  factor,  an  error  in  the  format 
was  discovered.  This  is  a  presentation  error  only,  and  can  easily  be 
corrected. The recommended correction appears in Section 4.4.4.

4.0 RECOMMENDATIONS AND CONCLUSIONS

Based on problems encountered during software quality evaluation, SAIC
has compiled a set of conclusions and recommended methodology
modifications. These conclusions and modifications are discussed in
this section, which is organized as follows:

• Section 4.1 contains conclusions drawn by SAIC based on the
project work performed.

• Section 4.2 presents general recommendations for changes or
additions to the methodology which are applicable to the
entire acquisition process.

• Section 4.3 includes recommendations for modifications to the
methodology which are applicable to the specification of
software quality requirements. These correspond to changes
recommended for specific paragraphs in Volume II of the
guidebooks.

• Section 4.4 discusses recommendations applicable to software
quality evaluation. These correspond to changes recommended
for specific paragraphs in Volume III of the guidebooks.

4.1  Conclusions 

SAIC believes the SQM methodology recommended in the guidebooks is sound
and that it can be very useful to the acquisition manager. The manner
of presentation in both volumes of the guidebooks can be improved, and
specific recommendations for improvements are listed below.

The resulting estimate of system quality determined using the SQM
methodology corresponds to that subjectively and intuitively determined
by the analysts assessing both aids. Analysts felt that the
documentation and structuring of the code were such that the
numerical scores received were justified. However, these scores do not
mean that both decision aids are "bad." The aids were intended to be
prototype, proof-of-concept developments. They were never meant to be
used in the field, nor to access actual intelligence data bases. The
aids were intended only to show the usefulness and meaningfulness of the
concepts in battle staff management. For this reason, it is not likely
that the added expense of software with high-quality development was
needed or appropriate.

The reaction of the software developers at PAR Technology is
indicative of barriers to be crossed before the SQM methodology can be
accepted across the DoD and industry. When presented with many of the
metric violations, the developers often seemed to feel that there was no
need for the system to meet such a standard; "of course it would
be included." For example, commenting need not be required because any
good developer heavily comments code. When measured, however, we often
found that they did not in fact follow this basic standard. Another
example is in the length of modules: some were over 1000 lines long
and only lightly commented. We believe that pressures and schedules
forced short-cuts that were unnoticed because only implicit standards
existed. When the development and acquisition communities accept
explicit standards and assessment, this problem will be greatly reduced.

4.2 General Recommendations

This  paragraph  contains  recommendations  that  are  general  in  nature,  and 
apply  to  the  methodology  spanning  the  entire  acquisition  process.  They 
do  not  reflect  any  particular  paragraph  or  area  in  the  SQM  guidebooks. 

4.2.1  Guidebook  Reorganization 

The SQM guidebooks are written in a style that mixes theory with the
methodological steps to be performed. There is some separation in the
organization of both guidebooks, with Section 4.0 of each basically
containing the procedures to be followed for quality specification and
evaluation. However, it would be better to further separate these
elements. Section 4.0 of both volumes should be directly concerned with
procedures to be executed, and contain clearly labelled examples of each
of the steps. The theory and justification for the steps, along with
explanations of the measurement technology, should all be contained in
the earlier sections of each volume. We recommend retaining the
"stand-alone" nature of each volume, with both volumes containing theory
and explanation as needed.

4.2.2  Continued  Research 

We also recommend that experimental work be continued to provide a
"real-world" basis for the contentions of the measurement methodology.
The techniques and procedures described will be much more acceptable to
acquisition managers and developers when experimentally verified as
being cost-effective and beneficial to projects.

It is very important that work be continued to establish the
relationship between the SQM methodology measurements of quality and the
perceived, real-world quality of systems. We need to validate what
scores mean and how important relative differences are. We need to
provide a method for the acquisition manager to understand what he is
receiving when a score for quality is measured at .7P, for example, as
opposed to .95. This double focus on specifying what the manager
requires, and providing information as to what that means in real-world
terms, is important. No project exists in isolation, and methods for
providing measurements and information across projects are important.

Until we have full data for determining what scores and results mean in
absolute terms, we are concerned that the SQM methodology will appear
arbitrary. Metric question weights and contributions to total scores are
one area that reflects this concern. For example, consider the
computation of the criterion anomaly management, used to determine the
factor RELIABILITY. At the preliminary design phase (Worksheet 2), the
Communication Errors metric consists of four metric elements (AM.6(1)
through AM.6(4)). At both the system/software requirements analysis
level and at the CSCI requirements level (Worksheet 0 and Worksheet 1),
the only metric element that comprises the Communication Errors metric
is AM.6(1). This means that at the first level of scoring the value of
AM.6(1) is automatically weighted at preliminary design as 1/4 of that
of the requirements analysis phase, merely because there are 3
additional metric elements for the preliminary design phase.
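
The weighting effect in the AM.6(1) example can be made concrete with a
short sketch. It assumes that a metric score is the unweighted mean of
its metric element scores, which is consistent with the 1/4 figure in
the text; the function names and the sample element scores are
hypothetical.

```python
def element_weight(num_elements: int) -> float:
    """Implicit weight of one element when a metric is an unweighted mean."""
    return 1 / num_elements


def metric_score(element_scores):
    """Unweighted mean of element scores (assumed scoring rule)."""
    return sum(element_scores) / len(element_scores)


requirements_weight = element_weight(1)  # AM.6(1) alone: Worksheets 0 and 1
design_weight = element_weight(4)        # AM.6(1) through AM.6(4): Worksheet 2

print("relative weight of AM.6(1) at preliminary design:",
      design_weight / requirements_weight)

# Hypothetical scores: AM.6(1) fails (0); the three added elements pass (1).
print("requirements phase metric score:", metric_score([0]))
print("preliminary design metric score:", metric_score([0, 1, 1, 1]))
```

Under this assumption the same failing answer to AM.6(1) drives the
metric to 0 at the requirements phase but only to 0.75 at preliminary
design, purely because of the element count.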

Another area of research concerns the metric questions that are
applicable to such areas as decision aid and knowledge-based systems, as
well as such projects as data base management systems. These more
specialized systems have structures that are not easily measured under
the more Fortran-oriented existing metric element questions. Examples
of these systems are those containing data base management languages,
rule-based systems, systems using non-von Neumann architectures, highly
distributed systems, and systems written using non-procedural languages.

Some of the metric factors, such as INTEGRITY, do not have questions
across all phases. These voids indicate that we need to look at
specific areas in order to generate more metric elements and metric
element questions. Each factor should be traceable through each phase
of the development process, and should have representing questions on
each level of worksheet. Further studies, particularly of the newer
systems described above, will help fill this gap.

Automation  of  metric  evaluation  is  important,  and  is  an  area  already 
undergoing  research  and  development.  In  addition  to  analyzing  the  types 
of  questions  that  may  be  counted  and  evaluated  automatically,  we  believe 
consideration  should  be  given  to  the  questions  that  provide  the  most 
information  (see  Section  4.3.4)  and  the  skill  level  of  those  analysts 
who  can  answer  the  questions  (see  Section  4.4.1).  In  addition,  we 
recommend  that  efforts  concentrate  on  the  questions  contained  in 
Worksheets  3B  and  4B.  This  is  because  of  the  application  of  these 
worksheets. Questions on earlier worksheets are completed once for the
system, or once for each CSCI. Even in large systems, the number of
CSCIs is generally small, and question evaluation relatively rapid. For
even smaller projects, however, the unit-level questions of 3B and 4B
are repeated a great many times. Each of these questions must be
answered for every single unit in the system, and this can become an
extremely time-consuming task. Any automation efforts on those
worksheets will provide a high level of return in reduction of human
labor requirements.
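
The scaling argument for Worksheets 3B and 4B can be sketched as a
simple count. Every figure below is hypothetical and chosen only to show
how unit-level questions dominate; the report does not give question
counts per worksheet or unit counts for a typical system.

```python
def evaluation_counts(system_questions, csci_questions, unit_questions,
                      num_cscis, num_units):
    """Count question evaluations when each level repeats per item."""
    per_system = system_questions           # answered once for the system
    per_csci = csci_questions * num_cscis   # answered once for each CSCI
    per_unit = unit_questions * num_units   # answered for every unit
    return per_system, per_csci, per_unit


# Hypothetical project: 5 CSCIs and 400 units.
per_system, per_csci, per_unit = evaluation_counts(
    system_questions=100, csci_questions=200, unit_questions=50,
    num_cscis=5, num_units=400)

total = per_system + per_csci + per_unit
print("evaluations by level:", per_system, per_csci, per_unit)
print("unit-level share of all evaluations:", per_unit / total)
```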

A great deal of work has been completed lately concerning the guidebook
methodology and other metric efforts. This work should be integrated
into a revised and enhanced approach and guidebook.

4.2.3 Training


SAIC  recommends  that  seminars  be  developed  and  offered  to  acquisition 
managers  to  train  them  in  the  use  of  the  software  quality  measurement 
methodology. A one- or two-day seminar would allow the acquisition
manager to receive materials on the methodology, to be able to discuss
its use, to ask questions directly of experts, and to engage in
round-table discussions about the methodology and its worth and
usefulness. Such seminars or classes make a great difference in
understanding  a  system  and  its  use.  It  is  easier  to  understand  even 
technical  and  detailed  instructions  once  a  thorough  background  has  been 
acquired.  This  will  also  help  motivate  the  acquisition  manager  to  use 
the  methodology  and  continue  his  education  about  its  features. 

The  specifying  of  software  quality  goals  would  be  aided  by  automating 
the  goal  specification  process  in  the  form  of  an  expert  or 
knowledge-based  system.  This  automation  would  help  the  acquisition 
manager effectively and quickly specify quality goals. A class and
demonstration for this system would be effective in increasing
acceptance  of  the  metric  measurement  process. 

In addition, we also recommend training classes and/or materials be
created for those who will be collecting data and evaluating the
worksheets.  This  training  would  greatly  aid  in  reducing  subjectivity, 
and  in  allowing  evaluators  to  share  their  knowledge  and  experiences. 
The  automation  of  metrics,  using  such  tools  as  the  Automated  Measurement 
System,  will  also  be  enhanced  by  providing  training.  The  training  could 
increase  SQM  technology  effectiveness,  and  thereby  increase  its 
acceptance  by  the  software  community  in  general. 

4.2.4  Framework  Modifications 

SAIC recommends that the basic metric framework be retained, but does
have some specific recommendations to improve it. The recommended
changes are minor; we do not advocate more substantial changes until
more data has been gathered validating the framework. This data will be
used to support the relationships currently existing (metric to criteria
to factor), and will aid in establishing new relationships.

One  recommended  area  concerns  the  quality  factors  EFFICIENCY  and 
INTEROPERABILITY.  As  currently  implemented,  both  of  these  factors 
basically measure how well the system meets its own requirements (e.g.,
is as efficient as it was required to be, or operates with other systems
as required). Neither factor provides information in absolute terms.
INTEROPERABILITY, in particular, does not assess the future ease of
connecting to other systems; it measures only if the system can be
connected to those specified in the requirement documentation. SAIC
recommends that more research be conducted concerning these two factors
in particular. It may be that both should be removed from the factor
level of the framework, and placed under the factor CORRECTNESS as
criteria.


In  addition,  two  of  the  quality  factors  have  few  metric  element 
questions to calculate their values. Both INTEGRITY and USABILITY need
to be evaluated for further criteria and metric question creation.

4.3 Quality Goal Specification Recommendations

Recommendations and conclusions contained in this section are applicable
to Volume II of the SQM guidebooks, the Software Quality Specification
Guidebook [BOE-2].

4.3.1 New Factor Relationships

See: Guidebook Volume II, Para. 4.1.3.3, "Quantification of
Relationships"

One  of  the  early  tasks  to  be  performed  by  the  acquisition  manager  is  the 
specification of goal scores for the quality factors. Part of this
process is the consideration of the relationship among quality factors.
This relationship can be of a positive nature, or it can be negative.

The  software  quality  measurement  (SQM)  methodology  states  that  the 
factor  EFFICIENCY  is  interrelated  with  all  other  factors  (except 
CORRECTNESS) in a particularly negative way. This means that the more
efficient a system is, the lower its quality can be when measured
for other factors. Volume II of the guidebook presents information in
Table 4.1.3-3 that shows these negative interrelationships. EFFICIENCY
has  varying  negative  impacts  on  every  other  quality  factor,  except  for 
the  factor  CORRECTNESS.  These  contentions  seem  to  be  intuitively  valid. 

The measurement technology (i.e., worksheet evaluation), however, does
not support this in any way. The only criteria applicable to the factor
EFFICIENCY are effectiveness-communication, effectiveness-processing,
and  effectiveness-storage.  These  relate  to  no  other  quality  factor. 
The  metric  elements  related  to  these  criteria  are  not  used  for  any  other 
criteria. 

The  nature  of  these  metric  element  questions,  summarized  for  reference 
in Table 4.3-1, does not bear out the contention of negative impact. The
complete  satisfaction  of  each  question  (i.e.,  a  score  of  "1"  or  "yes" 
for  all  elements)  would  not  cause  a  negative  impact  on  any  other  factor. 

A high quality score on EFFICIENCY and simultaneously on other quality
factors is perfectly possible given the present framework. We believe,
however, that EFFICIENCY does have a negative relationship to other
factors. This negative relationship should be reflected in the SQM
framework. (Positive relationships are discussed at the end of this
section.)


TABLE 4.3-1 SUMMARY OF EFFICIENCY METRIC ELEMENTS

WORKSHEET  MNEMONIC  QUESTION

0          EC.1(1)   Performance requirements for communication
           EP.1(1)   Performance requirements for processing
           EP.1(3)   Optimizing compiler or assembly lang.
           EP.1(5)   Overlays required
           EP.2(1)   Data storage and processing
           EP.2(2)   Efficient processing required
           EP.2(3)   Source code supporting variable initialization
           ES.1(1)   Data storage requirements
           ES.1(2)   Virtual storage
           ES.1(5)   Dynamic memory management
           ES.1(7)   Optimizing compiler
           ES.1(8)   Avoid redundant storage

1          EC.1(1)   Performance requirements for communication
           EP.1(1)   Performance requirements for processing
           EP.1(3)   Optimizing compiler or assembly lang.
           EP.1(5)   Overlays required
           EP.2(1)   Data storage and processing
           EP.2(2)   Efficient processing required
           EP.2(3)   Source code supporting variable initialization
           ES.1(1)   Data storage requirements
           ES.1(2)   Virtual storage
           ES.1(5)   Dynamic memory management
           ES.1(7)   Optimizing compiler
           ES.1(8)   Avoid redundant storage

2          EP.1(5)   Overlays used
           EP.2(2)   Storage organized for efficient processing
           EP.2(3)   Source code allow variable initialization
           EP.2(6)   Efficient processing of related similar items
           ES.1(2)   Virtual storage
           ES.1(5)   Dynamic memory management
           ES.1(8)   Free from redundant storage

3          EP.1(2)   Loops with non-loop dependent statements
           EP.1(4)   Compound expressions recalculated needlessly
           EP.1(6)   Bit/Byte packing/unpacking needlessly in loops
           EP.2(4)   Arithmetic expressions with different size items
           EP.2(5)   Mixed data types in arithmetic expressions
           EP.2(7)   Data item modified
           ES.1(6)   Data packing operations

4          EP.1(2)   Loops with non-loop dependent statements
           EP.1(3)   Unit optimized for processing efficiency
           EP.1(4)   Compound expressions recalculated needlessly
           EP.1(6)   Bit/Byte packing/unpacking needlessly in loops
           EP.2(4)   Arithmetic expressions with different size items
           EP.2(5)   Mixed data types in arithmetic expressions
           EP.2(7)   Data items modified
           ES.1(6)   Data packing operations

Any negative relationships among the factors should be supported by the
sharing of common metric elements. If EFFICIENCY has a negative impact
on RELIABILITY, then common elements should ensure that a higher score
in one means a lower score in the other. Otherwise, the factors are
independently measured and no relationship attributes need to be
considered.

This will likely mean that question elements must be considered in
pairs. As an example, consider this question from Worksheet 4B, MO.1(8):
"Is temporary storage (i.e., workspace reserved for intermediate or
partial results) used only by this unit during execution (i.e., is not
stored with other units)?" For the criterion modularity, this question
would contribute in a positive fashion. A "yes" answer would receive a
score of 1. In contrast, it is a more efficient use of memory space to
share storage. This means that a "no" answer should receive a score of
1 relating to the criterion effectiveness-storage. To accomplish this,
it is possible that the question could be paired, as shown below.

MO.1(8) Is temporary storage (i.e., workspace reserved for
intermediate or partial results) used only by this unit during
execution (i.e., is not stored with other units)?

ES.x(n) Does this unit share temporary storage (i.e., workspace
reserved for intermediate or partial results) with other units
during execution?

This  "paired"  approach  fits  in  effectively  with  the  workbook  methodology 
we  recommend  below  (paragraph  4.4.1)  for  data  collection.  The  data 
would  be  collected  once  using  the  workbook,  and  then  used  to  answer  each 
of  the  two  separate  questions.  This  means  that  no  additional  work  would 
be  required  to  gather  the  paired  data  items. 
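
A minimal sketch of the pairing, assuming a single yes/no observation is
recorded once in the workbook and both element scores are derived from
it. The function name and the boolean field are hypothetical; ES.x(n) is
the placeholder mnemonic used in the text.

```python
def paired_scores(storage_is_private: bool) -> dict:
    """Derive both paired element scores from one workbook observation."""
    return {
        # Modularity: private temporary storage is the desired answer.
        "MO.1(8)": 1 if storage_is_private else 0,
        # Effectiveness-storage: shared temporary storage is the desired answer.
        "ES.x(n)": 0 if storage_is_private else 1,
    }


for observation in (True, False):
    scores = paired_scores(observation)
    print(observation, scores, "sum =", sum(scores.values()))
```

Because the two scores always sum to 1, a gain on one criterion is
matched by a loss on the other, which is the negative relationship the
pairing is meant to encode.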

Existing  questions  can  be  used  to  build  this  interconnection  among 
related  factors.  Table  4.3-2  is  a  partial  list  of  these  existing 
elements, along with the newly-developed "pair" element. These elements
are too few to fully justify the relationships we believe exist. This
means that more questions are needed for each factor so that calculated
values will correspond to real-world observations.
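
The count-based pairs in Table 4.3-2 use the formulas b/a, 1-b/a, and
1/(a+1). The sketch below evaluates them for hypothetical counts. The
function names are illustrative, and the handling of a = 0 is an
assumption, since the table does not say how that case is scored.

```python
def am_1_2_score(recognized: int, needing_recovery: int) -> float:
    """Existing element AM.1(2): b/a."""
    if recognized <= 0:
        raise ValueError("a must be positive (assumed; not stated in the table)")
    return needing_recovery / recognized


def ep_pair_of_am_1_2(recognized: int, needing_recovery: int) -> float:
    """Proposed pair element EP.x(n): 1 - b/a."""
    return 1 - am_1_2_score(recognized, needing_recovery)


def ep_pair_of_am_1_4(repeated_instances: int) -> float:
    """Proposed pair element EP.x(n) for AM.1(4): 1/(a+1)."""
    return 1 / (repeated_instances + 1)


# Hypothetical counts: 10 recognized error conditions, 4 requiring recovery.
a, b = 10, 4
print("AM.1(2):", am_1_2_score(a, b))
print("EP.x(n) pair:", ep_pair_of_am_1_2(a, b))

# Hypothetical count: 3 instances of a process repeated for comparison.
print("EP.x(n) pair of AM.1(4):", ep_pair_of_am_1_4(3))
```

As with the yes/no pairs, the first two scores move in opposite
directions: more required recovery raises the anomaly management score
and lowers the paired efficiency score.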

One  advantage  of  this  paired  relationship  is  in  its  effect  on  the 
tendency  of  metric  methodology  users  to  try  to  force  every  score  to  a 
value  of  "1"  or  "yes."  Some  users  have  believed  that  the  purpose  of  the 
technology is to create a resulting system that has every metric element
question  answerable  with  a  "yes"  or  a  full  score  of  "1."  The  technology 
actually  goes  beyond  that,  and  is  an  attempt  to  reflect  system  quality 
in  a  cost-effective  manner.  This  means  that  some  "lacks"  in  a  system 
should not be corrected, because the cost would not justify the
benefits.  Paired  metric  questions  would  not  allow  all  scores  to  reach 
"1",  and  would  better  reflect  the  situations  existing  in  industry  today. 

Also  to  be  considered  in  this  regard  are  the  complementary  factor 
relationships  described  by  the  guidebooks.  P'0,11  quality  factors 
(RELIABILITY,  CORRECTNESS,  MAINTAINABILITY,  and  VERIFIABILITY)  are 
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TABLE  4.3-2  "EFFICIENCY"  SAMPLE  PROPOSED  QUESTIONS 


WORKSHEET/MNEMONIC   QUESTION

AM.1(2)              a.  How many error conditions are required to be
                         recognized (identified)?
                     b.  How many recognized error conditions require
                         recovery or repair?
                     c.  Calculate b/a and enter score.

EP.x(n)              a.  How many error conditions are required to be
                         recognized (identified)?
                     b.  How many recognized error conditions require
                         recovery or repair?
                     c.  Calculate 1-b/a and enter score.

AM.1(4)              How many instances of the same process (or
                     function, subfunction) being required to
                     execute more than once for comparison
                     purposes?

EP.x(n)              a.  How many instances of the same process (or
                         function, subfunction) being required to
                         execute more than once for comparison
                         purposes?
                     b.  Calculate 1/(a+1) and enter score.

AM.3(2)              Are there requirements to range test all critical
                     (e.g., supporting a mission-critical function)
                     loop and multiple transfer index parameters
                     before use?

EP.x(n)              Is the CSCI free from requirements to range test
                     all loop and multiple transfer index parameters
                     before use?

AM.3(3)              Are there requirements to range test all critical
                     (e.g., supporting a mission-critical function)
                     subscript values before use?

EP.x(n)              Is the CSCI free of requirements to range test
                     all critical (e.g., supporting a mission-critical
                     function) subscript values before use?

AM.7(2)              Are there requirements to periodically check all
                     adjacent nodes or interoperating systems for
                     operational status?

EP.x(n)              Is the CSCI free from requirements to
                     periodically check all adjacent nodes or
                     interoperating systems for operational status?

AP.4(1)              Is there a requirement to avoid or to limit the
                     use of microcode instruction statements?

EP.x(n)              Is the CSCI free from a requirement to avoid or
                     limit the use of microcode instruction
                     statements?
described as being complementary to all other factors. A low score on
any of these four factors means that the other quality factors, even
those with high scores, must be regarded as being of lower quality.
SAIC agrees with recommendations by other contractors that this
complementary relationship should not be included in the quality factor
framework.

Factor quality scores should indicate quality, regardless of the scores
achieved for other factors. This does not mean that the factors have no
overlap, but all measures necessary for a factor should be evaluated
with that factor. Measures should not depend on other scores (e.g.,
INTEGRITY is not high unless the four factors listed above are also
high). We recommend that the applicable criteria of the four factors
that are considered to be complementary be included in each quality
factor in the framework, so that each may stand alone. This common set
of data will likely ensure that the score for a factor such as INTEGRITY
cannot be high if scores for the four complementary factors are low.
Research efforts will need to identify what these applicable criteria
are.


4.3.2 Quality Factor Definition

See: Guidebook Volume II, Paragraphs 3.1.1 and 4.1.2.3, "Factor
Definitions and Rating Formulas", and "Quality Requirements Survey"

The SQM guidebook defines software quality factors in two tables: Volume
II, Tables 3.1-1 and 4.1.2-4. Volume III, in Tables 3.1-1 and 3.1-2,
contains the same information. These definitions, shown here in Table
4.3-3 for reference, are very misleading. There is absolutely no
evidence to date that the calculated scores for EFFICIENCY, for example,
relate to the formula given in the methodology. This formula (the
complement of the ratio of actual to allocated usage) may indeed be a
measurement of efficiency, but it in no way relates to the numbers
calculated for the factor EFFICIENCY. The implication in these tables
is that specifying a goal of .9 for EFFICIENCY, and receiving a score of
.9 using the worksheets, means that the measured resource utilization is
.9 of the allocation. This conclusion has not been validated. With any
weighting of criteria or metric elements, the relationship is even more
difficult to establish.

The use of these definitions by those new to software measurement
technology can cause difficulty. SAIC, as an example, has been involved
in the development of the Joint Forward Air Defense Test Bed (JFAAD) for
the Air Force. Using the SQM guidebooks, the JFAAD project office
requested that the system be built so as to achieve a final quality
rating of .996 for RELIABILITY. They calculated this rating, using
guidebook Table 3.1-1, by deciding that 4 errors in each 1000 lines of
code was an acceptable number. They expected that the RELIABILITY
scores achieved at each phase would reflect that number, and result in a
fielded system that had 4 or fewer errors for each 1000 lines of code.

At this point in time, there is no data to support that relationship.
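
The arithmetic behind the project office's goal can be sketched as
follows. This is a minimal illustration in Python, assuming the
RELIABILITY rating formula is 1 minus errors per line of code (which is
what a rating of .996 for 4 errors per 1000 lines implies). The function
names are hypothetical. The sketch shows only how such a goal is
derived, not that worksheet scores track it.

```python
def reliability_rating(errors: int, lines_of_code: int) -> float:
    """Assumed rating formula: 1 - (errors / lines of code)."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return 1.0 - errors / lines_of_code


def allowed_errors(rating_goal: float, lines_of_code: int) -> float:
    """Invert the assumed formula: errors implied by a rating goal."""
    return (1.0 - rating_goal) * lines_of_code


# 4 errors per 1000 lines of code corresponds to the .996 goal in the text.
print(round(reliability_rating(errors=4, lines_of_code=1000), 3))  # 0.996
```
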

Quality factor     Rating formula                                    Sample ratings

Efficiency         1 - (actual utilization / allocated utilization)  [illegible]
Reliability        1 - (errors / lines of code)                      [illegible]
Survivability      [illegible]                                       [illegible]
Usability          [illegible]                                       [illegible]
Correctness        [illegible]                                       [illegible]
Maintainability    1 - 0.1 (average labor-days to fix)               [illegible]
Verifiability      1 - (effort to verify / effort to develop)        [illegible]
Expandability      1 - (effort to expand / effort to develop)        [illegible]
Flexibility        1 - 0.05 (average labor-days to change)           [illegible]
Interoperability   1 - (effort to couple / effort to develop)        0.9 at 10% effort; 0.95 at 5% effort
Portability        1 - (effort to transport / effort to develop)     0.9 at 10% effort; 0.95 at 5% effort
Reusability        1 - (effort to convert / effort to develop)       0.4 at 60% effort; 0.6 at 40% effort


As  research  continues  over  time,  formulas  will  be  developed  that  show 
what  quality  measurement  values  mean  in  terms  of  such  things  as  resource 
utilization,  errors  per  line  of  code,  effort  to  fix,  and  effort  to 
convert.  At  the  present  time,  however,  it  is  very  misleading  to  imply 
that  we  know  what  quality  scores  mean  in  terms  of  absolute  numbers. 
Currently, they are relative indicators only. We recommend, therefore,
that any reference to these rating formulas be removed from the
guidebooks.

In replacement for these formulas, we recommend using more concrete
examples and descriptions of the factors. These descriptions can be
drawn from the criteria and metric levels of the measurement framework.
As an example, consider the quality factor MAINTAINABILITY. If the
acquisition manager is specifying that he wishes a system to be
maintainable, and if he is to later accept measurement of the level of
achievement of that factor, then he should have some idea of what is
considered to make up a maintainable system. This can be described in
terms of what is going to be used to assess how much "maintainability"
is present.

The products of the development process that are assessed are both
documents and software code itself. If the system is to be
maintainable, it means then that the documents and the source code
should have characteristics that promote maintainability. Using the
characteristics that are evaluated has the added advantage of giving the
acquisition manager more insight into what is being measured. Table
4.3-4 is an example of how this might be accomplished.

Documents should be accessible, well-structured, clearly and simply
written, depict control and data flow, be indexed, be separated by
system functions, and list all operational capabilities. Standards
should require such things as commenting global data and commenting
variables. Code should be structured, indented, of reasonable size,
etc. The framework indicates that items having these characteristics
are easier to maintain. A table that shows the product and the
attributes that make it of higher quality would be useful. A
non-software oriented person should be able to use these tables to
understand what is being assessed and what the quality factors indicate,
because the technical content need not be high. The data would provide
a positive look at what is measured, and what the assessment numbers
mean.

4.3.3  Factor/Criteria  Interrelationships 

See: Guidebook Volume II, Paragraph 4.1.3.1, "Shared Criteria"

The acquisition manager correlates system quality factors with software
quality factors, and additionally correlates the software quality
factors with the criteria which constitute them. The present
methodology does not provide adequate detail on these software
quality factors in terms of the criteria. The acquisition manager who
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TABLE  4.3-4  SAMPLE  "MAINTAINABILITY”  DESCRIPTION 


PRODUCT 


All  Documentation 


Software  Standards  and 
Procedures  Manual 


Software  Requirements 
Specification 

Software  Top  Level 
Design  Document 


Software  Detailed 
Design  Document 


Source  Code 


CHARACTERISTICS ENHANCING MAINTAINABILITY


Accessible
Well structured
Clearly and simply written
Indexing scheme used

Require code to have:

commented global data
commented variables

Design standards for unit prologues, comments,
and unit structures

Depict control and data flow

List all software operational capabilities

Standardized design representation
Calling sequence protocol established
External I/O protocol and format established
Error handling required
Functions always referenced by same name
Data representation standardized
Data naming consistent
Global data defined

Consistent calling sequence parameters
Use structured design techniques
Comply with standards

Follow standards

Units each have single name

Unit size small

Control variable passing minimized
Local storage

Single objective in each unit

Single entrance and exit in each unit

Branching levels low

Control flow is top to bottom

Few negative or compound Boolean expressions

Few loops with unnatural exits

No self-modifying units

Comply with standards
Unique data names
Commented

Written in a high order language
Data names descriptive
Indented and blocked logically
Single exit and entrance in each unit


attempts  to  thoroughly  understand  the  impact  of  his  direction  at  the 
factor  level  upon  the  criteria  at  the  design  level  must  understand 
definitions  that  are  likely  to  be  too  technical. 


Ideally,  goal  specification  could  be  conducted  using  an  automated 
decision  aid  which  employs  optimization  techniques.  Short  of  this 
enhancement  to  the  present  methodology,  providing  a  graphic 
representation  of  the  interrelationships  would  enable  goal  specification 
to  be  more  effective.  The  interrelationships  included  would  be  those 
between  each  quality  factor  (and  component  criteria)  and  the  other 
influencing  quality  factors  and  criteria. 

Figure 4.3-1 combines the information contained in several SQM guidebook
figures and tables, and represents the quality factor FLEXIBILITY, as an
example. One such figure for each quality factor would provide an
improvement over the complex methodology currently used. The figure is
broken into four quadrants: the two left quadrants are associated with
factors, the two right quadrants are associated with criteria, the two
upper quadrants are associated with positive influence, and the two
lower with negative. Arrows into the subject quality factor and its
criteria indicate influences upon them; arrows out indicate the subject
factor and its criteria's influence upon other factors and criteria.
This figure is shown as an example, and it is possible to include other
pertinent information. As an example, costs associated with factors
could be included in a graphic representation.

Using this figure alone, it is easy to comprehend that the quality
factor FLEXIBILITY is positively influenced by the quality factors of
CORRECTNESS and MAINTAINABILITY, and negatively influenced by
SURVIVABILITY. Additionally, the figure indicates the criteria that
determine FLEXIBILITY and those others which influence FLEXIBILITY. The
criteria of consistency and traceability (in the "positive criteria" or
upper right quadrant) have a positive influence on FLEXIBILITY, while
reconfigurability (in the "negative criteria" or lower right quadrant)
has a negative influence. This figure demonstrates nine types of
relationships:

•  Positive  factors  influencing  this  factor 

•  Negative  factors  influencing  this  factor 

•  Positive  criteria  influencing  this  factor 

•  Negative  criteria  influencing  this  factor 

•  Other factors which are positively influenced by this factor

•  Other factors which are negatively influenced by this factor

•  Other criteria which are positively influenced by this factor

•  Other criteria which are negatively influenced by this factor

•  Criteria  which  make  up  this  quality  factor 

Cost  ranges  could  also  be  included  in  the  figure  to  indicate  the  range 
of  cost  for  the  subject  quality  factor  for  system  acquisition  phases, 
and  additionally  indicate  the  cost  impact  for  each  phase  for  each 
quality  factor  that  influences  the  subject  factor.  This  would  enable 
the  individual  tasked  with  goal  specification  to  see  not  only  the  cost 
impact  of  the  emphasis  of  the  subject  quality  factor,  but  also  the  cost 
impact  of  each  of  the  other  factors  influencing  the  subject  factor 
positively  or  negatively,  for  each  acquisition  phase.  These  ranges 
could  be  presented  graphically,  for  ease  of  use. 

4.3.4 Minimum Effective Set of Measurements

See: Guidebook Volume II, Paragraph 4.1.4, "Consider Costs"

Another method to aid the acquisition manager in the specification of
software quality goals would be to aid him in determining the most
cost-effective minimum subset of factors, criteria, and metrics to be
used.

Figure 4.3-2 displays a matrix of the quality criteria and the quality
factors to which they apply. There are 29 criteria used in determining
13 quality factor scores. The distribution of the number of quality
factors per criterion is displayed in the figure. There are 53
intersections of factors and criteria represented, which we call "cells"
for the purpose of this analysis. Those criteria which are employed in
the determination of 3 or more quality factors actually account for 49%
of the 53 cells in the matrix. Thus, temporarily ignoring the number of
questions per cell and the associated relative difficulty in providing
the answers, the questions relating to the criteria of generality,
independence, modularity, self-descriptiveness, and simplicity provide
almost one half of the quality factor information. These criteria are
only one sixth of those measured. Similar analysis reveals that
modularity alone provides 15% (8 cells of 53) of the information, and
modularity, self-descriptiveness, and simplicity together provide 38%
(20 cells of 53) of the information indicated by "x's." These
relationships are indicated in Figure 4.3-3.

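
The cell-share calculation described above can be sketched in Python.
This is an illustration only: the toy matrix below is hypothetical and
much smaller than the 29-criteria matrix of Figure 4.3-2, and the
function name `cell_share` is an assumption made for the example.

```python
def cell_share(matrix: dict[str, set[str]], min_factors: int) -> float:
    """Fraction of all factor/criterion cells that belong to criteria
    used by at least `min_factors` quality factors."""
    total_cells = sum(len(factors) for factors in matrix.values())
    shared_cells = sum(
        len(factors) for factors in matrix.values() if len(factors) >= min_factors
    )
    return shared_cells / total_cells


# Hypothetical toy matrix: criterion -> quality factors it contributes to.
toy_matrix = {
    "modularity": {"MAINTAINABILITY", "FLEXIBILITY", "PORTABILITY", "REUSABILITY"},
    "simplicity": {"RELIABILITY", "MAINTAINABILITY", "VERIFIABILITY"},
    "accuracy": {"RELIABILITY"},
    "operability": {"USABILITY"},
    "training": {"USABILITY"},
}

# Two of the five criteria hold 7 of the 10 cells in this toy matrix.
print(cell_share(toy_matrix, min_factors=3))  # 0.7
```
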
Each criterion is scored by averaging the values assigned to its
applicable metric element questions. The actual number of metric
element questions asked for each criterion is not significant. Because
of the selection process, varying numbers of questions will be scored
across projects as the methodology is used. For this reason, we are not
analyzing here the number of questions used to compute each of these
multi-factor criteria.

For reference, however, there are 17 self-descriptiveness metric
elements, 30 simplicity metric elements, 3 generality metric elements,
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14  modularity  metric  elements,  and  7  independence  metric  elements. 
There are a total of 327 metric elements over the worksheet set. This
means that 5 of 29 criteria (17%) contain 71 of 327 metric elements
(about 22%), and make up 49% of the factor scores when criteria
weighting is not included. One implication of this is that these 71
metric elements might be prime candidates for automating data
evaluation. While some metrics are relatively easy to automate, even
the more sophisticated metrics would provide a great deal of information
if selected from this set.
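
The proportions quoted in this analysis can be checked with a few lines
of Python. The counts 5 of 29 and 71 of 327 come from the text; the
count of 26 cells is an assumption, chosen because it is the whole
number of cells consistent with "49% of the 53 cells".

```python
def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


print(f"criteria:        {pct(5, 29):.1f}%")    # 17.2%
print(f"metric elements: {pct(71, 327):.1f}%")  # 21.7%
print(f"matrix cells:    {pct(26, 53):.1f}%")   # 49.1%
```

Roughly one sixth of the criteria and one fifth of the metric elements
account for about half of the factor/criteria cells, which is the basis
for treating them as automation candidates.
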

A potentially viable alternative to full-scale software quality
evaluation with seven worksheets would be to evaluate a reduced set of
quality criteria. Since five criteria (generality, independence,
modularity, simplicity, and self-descriptiveness) account for almost
half of the factor/criteria cells, they alone could be used. This
would provide no information for the factors CORRECTNESS, EFFICIENCY,
INTEGRITY, and USABILITY, but partial information would be provided for
all the other factors.

Another  approach  that  could  be  suggested  to  acquisition  managers  is  to 
measure,  in  the  early  stages  of  development  (perhaps  up  to  and  including 
preliminary design), only those criteria whose quality factors have been
emphasized  as  requiring  excellence.  This  could  be  used  to  drive 
detailed  design,  which  would  in  turn  drive  coding,  in  the  intended 
direction.  In  this  manner,  the  evaluation  required  is  less  exhaustive 
and  focused  early  to  ensure  that  quality  is  high  where  it  is  most 
required. 

The  third  potential  approach,  used  already  in  some  projects,  is  to 
select  a  subset  of  quality  factors.  These  factors  are  measured  fully 
for  each  development  phase. 

4.3.5  Functional  Allocation 

See: Guidebook Volume II, Paragraphs 4.1.1 and 4.4.1, "Identify
Functions", and "Review Requirements Allocations and Evaluation
Formulas"

The  first  step  that  the  acquisition  manager  must  complete  is  the 
identification  of  the  functions  of  his  system.  To  perform  this  process, 
he  identifies  each  system  function  which  is  supported  by  software  and 
which  will  have  separate  quality  requirements. 

The SQM methodology discusses the different quality goals possible among
different  system  functions,  and  includes  a  sample  of  the  functions 
associated  with  an  example  command  and  control  system.  In  addition  to 
these  considerations,  the  acquisition  manager  must  examine  functions 
unique  to  software  (for  example:  man-machine  interfaces,  executives, 
mission  training,  and  integrated  test  functions). 

It  is  up  to  the  quality  assessment  evaluator  to  allocate  these  functions 
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among  the  actual  software  Computer  Software  Configuration  Items  (CSCIs) 
that  are  developed.  This  process  is  described  in  paragraph  4.1  of 
Volume  III  of  the  guidebooks.  The  acquisition  manager  then  assesses 
this  allocation  as  described  in  Volume  II,  paragraph  4.4.1. 

SAIC recommends that much more guidance be supplied to the manager
concerning both the derivation of functions for which quality goals are
set and the assessment of the requirement allocations and evaluation
formulas.

One  possible  approach  is  to  use  a  system-wide  specification  of  quality 
goals  unless  certain  criteria  are  met  that  force  specifying  functional 
goals.  These  criteria  could  include  the  following: 

•  Functional areas already have been identified. Many large
systems are functionally decomposed in the original
procurement documents (such as required operational concepts
or purchase descriptions). Even if system development does
not necessarily follow these functional dividing lines when
CSCIs are created, they are a common point of reference that
would allow a meaningful evaluation.

•  The system is so diversified that wide ranges of quality
goals are reasonable for various system functions. A large
system may have such disparate needs that system-wide goals
are not reasonable.

•  The criticality (for example, risk to human life) is such
that some functions must have high goals, but cost factors
dictate that non-critical functions do not have the same
goals.

The reason behind providing this guidance is that an arbitrary
functional decomposition, followed by a somewhat arbitrary requirement
allocation, only exacerbates the problems in determining what quality
goals and scores actually mean for real-world referents.

The guidebook should clearly state that setting system-wide goals is
reasonable and acceptable. In cases where particular goals are desired
for particular functions, it may be effective to isolate these functions
as CSCIs and set goals directly related to the CSCI itself.

In addition, we recommend de-emphasizing the process of allocating
functional requirements among the various CSCIs of the system. We
recommend using this system only when functions have been derived as
described above, and then in a slightly different manner than is
described in the guidebooks.

Rather than determining the percentage of each CSCI that contributes to
each function, we recommend listing (for each function) all CSCIs that
implement it. The scores calculated for each CSCI would then be
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averaged  to  calculate  the  functional  score,  without  taking  into 
consideration  how  much  of  the  CSCI  actually  is  performing  a  given 
function. This will reduce some of the arbitrary nature of the
allocation process, in order to increase the comparability of results
across projects. This will also build information so that we may
define, in terms of actual systems built and used, what the calculated
quality numbers mean (e.g., errors per delivered lines of code, time to
upgrade, time to transport).
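
The recommended calculation can be sketched as below. This is a minimal
Python illustration, assuming each function is simply listed with the
CSCIs that implement it. The CSCI names, function names, and scores are
hypothetical.

```python
from statistics import mean


def functional_scores(
    function_to_cscis: dict[str, list[str]],
    csci_scores: dict[str, float],
) -> dict[str, float]:
    """Average the scores of every CSCI that implements a function,
    ignoring how much of each CSCI is devoted to that function."""
    return {
        function: mean(csci_scores[csci] for csci in cscis)
        for function, cscis in function_to_cscis.items()
    }


# Hypothetical factor scores already calculated for each CSCI.
csci_scores = {"CSCI_A": 0.9, "CSCI_B": 0.7, "CSCI_C": 0.8}

# Hypothetical listing of the CSCIs that implement each function.
function_to_cscis = {
    "tracking": ["CSCI_A", "CSCI_B"],
    "display": ["CSCI_B", "CSCI_C"],
}

for function, score in functional_scores(function_to_cscis, csci_scores).items():
    print(function, round(score, 2))
```

Because no percentage allocation is involved, two evaluators given the
same listing and the same CSCI scores obtain the same functional score.
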


4.3.6  Violation  Procedures 


See:  Guidebook  Volume  II,  Paragraph  4.1.4,  "Review  Metric  Scores" 


The goal specification guidebook should contain information on how the
acquisition manager can handle non-compliance with metric element
questions, and with criteria and factor scores which fall below his
goals. Along with the specification of the quality itself, the manager
needs to outline what will be done when those specifications are not
met. To do this, he needs guidance from the methodology about what
procedures are best. It may be, for example, that a software problem
report is to be generated for every "0" or "no" answer to a metric
question on worksheets 0, 1, 2, 3A, and 4A. For some questions on 3A
and 4A, it may be that a threshold figure should be specified against
which violations must be written up as problem reports. Some projects
have demanded a problem report for every violation in an effort to try
to force the quality scores to a value of 1. This is probably not
desirable, but guidance should be supplied. If the goal is to assess,
but not repair, violations, this should be specified by the
acquisition manager.
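
One way such a violation procedure might be written down is sketched
below in Python. The rule shown (report every zero answer, and report
scored questions that fall below a per-question threshold) is an
assumption drawn from the examples in the text, not a procedure defined
by the guidebooks. The `Answer` record, the threshold value, and the
pairing of mnemonics with worksheets are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Answer:
    worksheet: str  # e.g. "1" or "3A"
    mnemonic: str   # metric element identifier, e.g. "AM.1(2)"
    score: float    # 0.0 to 1.0; a "no" answer is stored as 0.0


def needs_problem_report(answer: Answer, thresholds: dict[str, float]) -> bool:
    """Return True when the answer should be written up as a problem report.

    Questions with a threshold are reported when the score falls below it;
    all other questions are reported only for a "0" or "no" answer.
    """
    limit = thresholds.get(answer.mnemonic)
    if limit is None:
        return answer.score == 0.0
    return answer.score < limit


answers = [
    Answer("1", "AM.3(2)", 0.0),   # "no" answer, no threshold set
    Answer("3A", "AM.1(2)", 0.6),  # below its threshold of 0.7
    Answer("3A", "AM.1(4)", 0.5),  # low, but non-zero and no threshold set
]
thresholds = {"AM.1(2)": 0.7}

reports = [a.mnemonic for a in answers if needs_problem_report(a, thresholds)]
print(reports)  # ['AM.3(2)', 'AM.1(2)']
```
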


Independent  verification  and  validation  contractors  could  be  of 
assistance  to  the  acquisition  manager  in  this  regard.  The  experience 
and  knowledge  of  these  analysts  could  be  used  to  help  determine  what  is 
critical  to  the  development  and  help  to  define  procedures  to  ensure  that 
violations  are  handled  properly.  The  acquisition  manager  could  use  this 
information  to  establish  the  needed  procedures. 


4.3.7  Scaling  of  Scores 


See:  Guidebook  Volume  II,  Paragraph  4.4.2,  "Review  Factor  Scores" 


The  acquisition  manager  reviews  quality  scores  at  review  points 
throughout  the  development  process.  He  examines  the  scores,  using  a 
range  of  .9  to  1  as  Excellent,  of  .8  to  .89  as  Good,  and  of  .7  to  .79  as 
Average. 
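
The review bands can be expressed as a small lookup, sketched here in
Python. Two assumptions are made: scores that fall between the stated
ranges (for example .895) are assigned to the lower band, and scores
below .7 receive no label because the text names none.

```python
from typing import Optional


def rating_band(score: float) -> Optional[str]:
    """Map a quality score to the band used at review points."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie between 0 and 1")
    if score >= 0.9:
        return "Excellent"
    if score >= 0.8:
        return "Good"
    if score >= 0.7:
        return "Average"
    return None  # no band is named below .7


for score in (0.95, 0.85, 0.72, 0.5):
    print(score, rating_band(score))
```
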


Lacking experimental evidence as to what the measured scores really
reflect (as yet they are only relative), we believe that this
relationship and numerical range may be too limited. For some metric
questions (those that are measured, for example, as 1/n), each change in
the value can produce a wide variation in result (for example, from 1 to
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.5  to  .33).  This  variation  reduces  as  the  number  grows  large,  but  by 
then  the  score  is  well  below  even  the  .7  to  .79  "average"  range.  We  do 
not  yet  really  know  if  there  is  a  meaningful  difference  between  a  score 
of  .8  and  .9,  nor  do  we  know  what  the  "unacceptable"  cut-off  level 
should  really  be. 
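
The sensitivity described above can be seen by tabulating a
reciprocal-style score. The sketch below assumes a question scored as
1/n for a count n, which matches the sequence 1, .5, .33 quoted in the
text; the function name is hypothetical.

```python
def reciprocal_score(count: int) -> float:
    """Score a count-based question as 1/count (assumed form)."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return 1.0 / count


scores = [round(reciprocal_score(n), 2) for n in range(1, 6)]
print(scores)  # [1.0, 0.5, 0.33, 0.25, 0.2]

# Size of the drop between consecutive counts shrinks as the count grows.
drops = [round(a - b, 2) for a, b in zip(scores, scores[1:])]
print(drops)  # [0.5, 0.17, 0.08, 0.05]
```

A single additional occurrence moves the score from 1 to .5, already
below the .7 to .79 band, while later occurrences change it very little.
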

From  analyzing  the  goals  set  by  the  decision  aid  developers  themselves, 
we  also  believe  there  is  a  tendency  to  set  quality  goals  higher  than 
they  really  need  to  be.  The  developers  indicated  several  factors  should 
have  "excellent"  values  (which  were  not  achieved),  which  we  believe  to 
be  above  those  really  needed  for  such  a  prototype  system. 

As a result, we believe that more information needs to be supplied to
the acquisition manager to allow him to effectively evaluate the
achieved scores, their ranges, and what those ranges actually mean.

4.3.8 Criteria Weighting

See:  Guidebook  Volume  II,  Paragraph  4.2.2,  "Assign  Weighting  Formulas" 

The process of weighting criteria is described in the SQM guidebook
during the quality goal specification process. Each criterion making up
the selected quality factors is evaluated, and weightings are assigned
to each (determining its percentage of the total factor score).

This process is very arbitrary and subjective. An acquisition manager
has no theoretical justification for weighting any criteria, because we
do not yet have all the information needed. We cannot currently point
to a real-world meaning for the calculated values we create, but we do
know that that is the direction we would like to move. We would like to
be able to say that a system with a MAINTAINABILITY score of .98, for
example, is going to be relatively inexpensive to maintain (especially
compared to a system with a score of .45). If acquisition managers
arbitrarily assign weights to the criteria that make up maintainability,
then we have no way to compare scores.

If these scores are not meant to be used for comparison purposes, then
why make a calculation at all? It would be logical, in that case, to
use the metric worksheets as checklists only. Creating a "score" would
not provide any further information for the developer or the acquisition
manager. It would allow no relative assessment of a system's quality.

Until we have collected the data to validate the numbers we are
calculating (i.e., what does a .93 score for CORRECTNESS mean?), SAIC
recommends that the weighting process be dropped altogether. Rather
than assigning arbitrary weights, we recommend that the acquisition
manager be instructed to calculate factor scores by averaging the
applicable criteria scores. This approach is already used, though not
commented upon, in calculating the criteria and metric scores
themselves. We do not currently weight metrics, metric elements, or
metric element questions when they are grouped and scored. As data
validates the software quality measurement methodology results, we
believe that we will be able to derive weightings that should be used at
all these levels, including criteria. These weightings will allow us to
make objective assessments of quality that apply in the same fashion to
most application systems.
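
The comparability problem can be illustrated with a short Python sketch.
The criteria scores and both sets of weights below are hypothetical; the
point is only that identical measurements yield different factor scores
under different arbitrary weightings, while a plain average yields one.

```python
from statistics import mean


def unweighted_factor_score(criteria: dict[str, float]) -> float:
    """Recommended approach: average the applicable criteria scores."""
    return mean(criteria.values())


def weighted_factor_score(criteria: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of criteria scores, normalized by the total weight."""
    total_weight = sum(weights[name] for name in criteria)
    return sum(criteria[name] * weights[name] for name in criteria) / total_weight


# Hypothetical criteria scores for one system.
criteria = {"modularity": 0.9, "simplicity": 0.5, "consistency": 0.7}

# Two managers choosing different, equally arbitrary weightings.
weights_a = {"modularity": 0.6, "simplicity": 0.2, "consistency": 0.2}
weights_b = {"modularity": 0.2, "simplicity": 0.6, "consistency": 0.2}

print(round(unweighted_factor_score(criteria), 2))           # 0.7
print(round(weighted_factor_score(criteria, weights_a), 2))  # 0.78
print(round(weighted_factor_score(criteria, weights_b), 2))  # 0.62
```
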

In the place currently occupied by weighting criteria, we recommend
supplying procedures that substitute for the weighting concept. Any
non-applicable or non-desired criteria would be dropped from measurement
and their respective metric elements left unscored. An example of this
would be in eliminating a criterion like virtuality, used chiefly in
network applications.

For factor criteria that are of unequal importance to the acquisition
manager, procedures could be included to allow violations (i.e., low
criteria scores) to be handled in a different fashion for those
criteria. If effectiveness-processing (EP) is far more important to the
manager than is effectiveness-communication (EC), then the results of
scoring each of those criteria could be handled differently. A low
score achieved concerning communication would be specified as
acceptable, while a low score for the processing measure would require
corrective measures. In this way, the important criteria are still
regarded as vital, but do not contribute to making factor scores
incomparable across systems.

4.3.9  Survey  Questionnaires 

See: Guidebook Volume II, Paragraph 4.1.2.3, "Quality Requirements
Survey"

Quality  survey  questionnaires  are  sent  out  by  the  acquisition  manager  to 
collect  information  for  quality  score  specification.  When  responding, 
each  person  surveyed  lists  the  system  functions,  and  then  inserts  the 
quality  scores  he  would  set  for  each  function. 

SAIC recommends that the functions, if they are to be used at all, be
specified by the acquisition manager and included on the surveys. In
addition, the forms in the guidebook, shown as examples, do not have
space set aside for the respondent to indicate factor quality scores.
We recommend adding specific room for these answers.

4.4 Quality Assessment Recommendations

Recommendations in this section apply to Volume III, Software Quality
Evaluation Guidebook [BOE-3].
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4.4.1  Data  Collection  Workbook 

See: Guidebook Volume III, Paragraph 4.2.3, "Answer Worksheet
Questions"

Volume  III  of  the  guidebooks,  in  paragraph  4.2.3,  discusses  how  the 
worksheets  are  to  be  filled  out  for  metric  evaluation.  The  section  is 
approximately  one  and  one-half  pages  long,  and  briefly  describes  the 
layout  of  the  worksheets,  who  should  answer  the  questions,  how  to 
identify source material, and what to do about all-inclusive questions.
It  mentions  that  results  should  be  reproducible,  and  that  any  judgments 
made  and  metric  violations  noted  should  be  documented. 

There  are  no  detailed  explanations  or  recommendations  concerning  exactly 
how  the  worksheet  questions  are  to  be  answered.  We  recommend  that  such 
details be included in order to promote a uniform approach to software
measurement technology. This uniformity will allow comparable results to
be gathered across various projects. It will reduce
the  perception  in  industry  and  the  academic  community  that  this 
technology  is  arbitrary,  and  allow  research  to  proceed  that  can 
correlate  quality  measurements  to  actual  system  performance.  In 
addition,  such  details  would  supply  guidance  to  allow  the  technology  to 
be  most  effectively  and  efficiently  used. 

The  text  should  reflect  the  fact  that  it  is  not  the  best  approach  to 
simply  sit  down  with  a  document  (or  source  code),  get  the  worksheet,  and 
answer  each  question  in  the  order  presented.  The  worksheets  are  not 
really  worksheets,  but  are  simply  a  list  of  applicable  questions 
presented  in  order  by  mnemonic.  This  listing  is  important  and  useful, 
but  data  should  not  be  collected  in  the  same  manner  as  the  list  is 
presented.  This  is  because  the  worksheet  questions  are  repetitive  and 
interrelated.  The  questions  are  valid,  but  data  can  be  gathered  much 
more  efficiently  and  effectively  if  another  approach  is  taken. 

The SQM guidebook should recommend an approach that is useful and
cost-effective, minimizing manual effort and maximizing the data
gathered. Automated tools are an obvious direction to take, but the
methodology does not indicate how to take advantage of these tools.

One  way  to  aid  the  user,  whether  he  has  access  to  tools  or  only  to 
manual  data  collection,  is  to  categorize  the  questions.  Various  types 
of  categories  would  be  useful  to  the  question-answering  process. 
Potential  category  methods  are  discussed  below: 

•  Data  necessary  to  answer  the  question 

• Skill level required to answer the question

• Techniques for answering the question

Data necessary to answer the question. The SQM methodology was designed
to be in accord with the software life cycle as described in DOD-STD-2167.
This life cycle has defined products and reviews that are recommended
for each phase of development and deployment. The Data Item
Descriptions (DIDs) that correspond to these products mandate certain
information for each of the products created. One categorization method
would be to list the correspondence between question source material
(the location where the answer should be found in a standard DOD-STD-2167
type development) and the questions. An example of this is question
SI.1(9) in Worksheet 0: Are there requirements for a programming
standard? The Software Development Plan (DI-MCCR-80030), in a
development following DOD-STD-2167, should contain this information in
paragraph 5.1.6. Knowing where this information should be located would
greatly aid in either automated or manual data collection.
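
A correspondence list of this kind could be kept as a simple lookup table. The following is a minimal sketch only: the table name and function are hypothetical, and the single entry shown is the SI.1(9) example from the text. A complete index would have to be built from the DIDs for each product.

```python
from collections import defaultdict

# Hypothetical index: question ID -> (source document, paragraph).
# Only the SI.1(9) entry is taken from the text above.
QUESTION_SOURCES = {
    "SI.1(9)": ("Software Development Plan (DI-MCCR-80030)", "5.1.6"),
}


def questions_by_location(sources):
    """Group question IDs by the document paragraph expected to answer them."""
    grouped = defaultdict(list)
    for question, location in sources.items():
        grouped[location].append(question)
    return dict(grouped)


index = questions_by_location(QUESTION_SOURCES)
for (document, paragraph), questions in index.items():
    print(document, "paragraph", paragraph, "->", ", ".join(questions))
```

Grouping by location rather than by question lets an analyst (or a tool) visit each paragraph once and answer every question that draws on it.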


A  second  aspect  of  knowing  the  source  of  data  necessary  to  answer  each 
question  involves  looking  at  the  minimum  amount  of  data  required  to 
answer  all  questions.  As  mentioned  before,  the  worksheet  questions  are 
not  totally  independent  of  each  other.  Some  questions  are  word-for-word 
repetitions  of  other  questions,  and  some  questions  use  data  that  is  also 
used  to  answer  other  questions.  This  is  not  a  defect,  but  reflects  the 
reality  that  the  criteria  and  factors  we  measure  are  not  (and  should  not 
be)  orthogonal  to  each  other. 


It  does  mean,  however,  that  answering  each  worksheet  question  in  order 
may  involve  doing  the  same  work  over  and  over.  To  avoid  this,  it  would 
be easy to create a list of the minimum data set required to answer all
questions  for  each  worksheet. 
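
One way such a minimum data set might be derived is sketched below. The question IDs and data item names are placeholders chosen for illustration, not entries from any worksheet; the point is only that each data item is listed once, together with the questions that use it.

```python
# Hypothetical mapping of questions to the data items they need.
# "Q1".."Q3" and the item names are placeholders, not worksheet content.
QUESTION_DATA = {
    "Q1": {"lines_of_code", "lines_of_comments"},
    "Q2": {"lines_of_code", "nesting_level"},
    "Q3": {"lines_of_comments"},
}


def minimum_data_set(question_data):
    """Return each data item once, with the questions that consume it."""
    users = {}
    for question, items in sorted(question_data.items()):
        for item in items:
            users.setdefault(item, []).append(question)
    return users


needed = minimum_data_set(QUESTION_DATA)
for item in sorted(needed):
    print(item, "is used by", needed[item])
```

Here three questions require only three distinct data items, so each item is collected once instead of once per question.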


Skill  level  required  to  answer  the  question.  The  questions  on  the 
various  worksheets  are  not  all  equivalent  in  terms  of  the  background  and 
experience  necessary  for  an  analyst  to  quickly  and  effectively  answer 
them.  The  most  senior  personnel  could,  of  course,  answer  all  questions. 
However, some questions are so straightforward as to lend themselves
easily  to  either  automation  or  to  evaluation  by  much  less  experienced 
analysts.  It  is  more  cost-effective  to  use  these  techniques  where 
possible,  and  to  focus  more  senior  effort  on  the  areas  where  it  is 
important.  Table  4.4-1  presents  our  analysis  of  the  skill  level 
required  for  the  various  questions  on  Worksheet  4B.  The  category 
"junior  analyst"  refers  to  people  with  1  to  4  years  experience  in  such 
things  as  documentation,  requirements  analysis,  and  configuration 
management.  The  "senior  analyst"  has  5  to  15  years  experience  in  the 
same sort of tasks. "Junior programmers" are design and implementation
personnel, typically with 1 to 5 years experience. The "senior
programmers" are highly experienced in system requirements, design,
coding,  and  testing. 


The  skill  level  analysis  was  required  for  our  evaluation  of  Worksheet  4B 
only.  Earlier  worksheets  were  such  that  non-programmers  could  easily 
evaluate nearly all of the questions. Examination and understanding of
the  code,  however,  required  personnel  experienced  in  programming.  Based 
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TABLE  4.4-1 


SKILL LEVEL REQUIRED TO ANSWER QUESTIONS ON WORKSHEET 4B

Columns: QUESTION, JUNIOR ANALYST, SENIOR ANALYST, JUNIOR PROGRAMMER,
SENIOR PROGRAMMER

[Each question listed carries a single X; the column in which each X
falls is not recoverable from this copy.]

AM.1(3), AM.2(7), AP.1(1), AP.2(1), AP.2(2), AP.2(3), AP.2(4), AP.3(2),
AP.4(1), AT.1(1), AT.2(1), AT.2(2), CP.1(2), CP.1(4), CP.1(9), CP.1(10),
CS.1(2), CS.1(3), CS.1(4), CS.1(5), CS.2(1), CS.2(2), CS.2(3), CS.2(6),
EP.1(2), EP.1(3), EP.1(4), EP.1(6), EP.2(4), EP.2(5), EP.2(7), ES.1(6),
FS.1(1), FS.1(2), GE.2(2), GE.2(3), GE.2(4), ID.1(1), ID.1(3), MO.1(3),
MO.1(4), MO.1(5)


TABLE 4.4-1 (CONT)

SKILL LEVEL REQUIRED TO ANSWER QUESTIONS ON WORKSHEET 4B

Columns: QUESTION, JUNIOR ANALYST, SENIOR ANALYST, JUNIOR PROGRAMMER,
SENIOR PROGRAMMER

[Each question listed carries a single X; the column in which each X
falls is not recoverable from this copy.]

MO.1(6), MO.1(7), MO.1(8), MO.1(9), MO.2(5), SD.1(1), SD.2(1), SD.2(2),
SD.2(3), SD.2(4), SD.2(5), SD.2(6), SD.2(7), SD.2(8), SD.3(1), SD.3(2),
SD.3(3), SD.3(4), SD.3(5), SD.3(6), SI.1(2), SI.1(3), SI.1(4), SI.1(5),
SI.3(1), SI.4(1), SI.4(2), SI.4(3), SI.4(4), SI.4(5), SI.4(6), SI.4(7),
SI.4(8), SI.4(9), SI.4(10), SI.4(11), SI.4(12), SI.4(13), SI.5(1),
SI.5(2), SI.5(3)
X 
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TABLE  4.4-1  (CONT) 


SKILL LEVEL REQUIRED TO ANSWER QUESTIONS ON WORKSHEET 4B

Columns: QUESTION, JUNIOR ANALYST, SENIOR ANALYST, JUNIOR PROGRAMMER,
SENIOR PROGRAMMER

[Each question listed carries a single X; the column in which each X
falls is not recoverable from this copy.]

SI.6(1), ST.1(1), ST.1(2), ST.1(3), ST.1(4), ST.1(5), ST.2(1), ST.2(2),
ST.2(3), ST.2(4), ST.2(5), ST.3(3), ST.4(1), ST.4(2), ST.4(4), ST.4(5),
ST.5(1), ST.5(2), ST.5(3), ST.5(4), VS.1(1), VS.1(2)


on our experience, however, we recommend that early worksheets be
evaluated by more senior personnel, and that the later worksheets
mostly be evaluated automatically and by more junior programming
personnel. The reason for using senior personnel early is to add their
experience and knowledge to the interpretation of the system and of
metric questions. The decisions made early in the life of a project
have great impact on its cost and effectiveness. Later worksheets are
of course important, but errors in judgment at that point are less
costly to correct than are errors made during requirements analysis.

There are several implications of listing the skill level required to
answer each of the questions on each of the worksheets. In general, the
greater the skill or experience required to answer a metric question,
the more difficult it will be to automate that question. Automation is
an important cost reduction tool for the software quality measurement
methodology, and can be used in conjunction with an understanding of
the experience needed. In addition, experience is needed to aid in the
evaluation of the impact of a "No" or "0" answer to a metric question.
This experience can be used to make recommendations as to corrections
and effort that should be devoted to a particular problem. The more
subjective metric questions should be analyzed by the more senior and
experienced staff, in order to best use the talent available. This is
particularly true in the earlier worksheets (Worksheets 0, 1, and 2).
The later worksheets (3A & 3B, and 4A & 4B) are more easily automated,
and tend to contain questions that are more explicit and of a "counting"
nature.

Techniques for answering the question. The methods used to answer
questions lend themselves very easily to establishing some basic
categories. The types of categories we recommend include:

• Counting. Some metric questions may be answered by a
straightforward counting of such things as lines of code,
lines of comments, nesting level, and data references. These
in general require little decision-making, and may be
completed in one pass through a document section or unit of
source code.

•  Understanding.  Some  questions  require  an  analyst  to  read  and 
understand  material,  in  order  to  decide  if  the  material  is 
clear,  complete,  logically  indented,  etc. 

RECOMMENDATION: These types of categorization and analysis methods are
all reflected in the workbook approach recommended by SAIC. The
workbook would be a collection of true worksheets, with each worksheet
indicating the categories and information discussed above in this
paragraph.


Figure 4.4-1 is a sample of how some of this workbook should look. The
workbook would indicate where in particular documents data items are
expected to be found, which would eliminate a great deal of search
time. It would also indicate the skill level of the analyst needed to
collect and evaluate the data for each of these new worksheets, and
indicate techniques for data collection and evaluation.

The  workbook  would  indicate,  for  sections  of  each  source  document,  what 
questions  are  to  be  answered,  what  data  collected,  and  who  can  best 
collect the data. An example of this is the data collected for each
"described  function."  The  analyst  would  examine  each  function,  and  note 
whether  it  is  defined  completely,  what  its  name  is,  and  what  references 
to  it  exist.  All  this  information  would  be  used  to  answer  metric 
questions  on  the  kind  of  worksheets  currently  given  in  Volume  III  of  the 
guidebooks.  The  analyst  evaluating  the  specification  would  be  aware  of 
exactly  what  data  is  needed,  and  where  it  is  to  be  used  on  each 
worksheet. 

In  addition,  the  workbook  could  contain  areas  that  allow  the  evaluator 
to  add  comments  and  information  on  any  software  problem  reports 
generated.  Dates  of  problem  report  submittal  and  resolution  could  also 
be  included. 

The  workbook  approach  is  also  very  efficient  at  gathering  data,  even 
using  manual  methods.  The  analyst  performing  the  evaluation  is  not 
forced  into  repetitive  examination  of  material,  but  evaluates  it  in  a 
meaningful  fashion  with  as  few  iterations  as  possible.  Another  main 
strength  lies  in  the  repeatability  of  the  process.  Currently,  no  way 
exists  to  verify  a  metric  score,  because  no  data  is  retained  which 
supports  any  conclusions.  Using  a  formal  data  collection  workbook  would 
provide  a  means  of  retaining  the  data,  allowing  verification,  and  also 
allowing  the  recalculation  of  scores  based  on  document  updates. 

These  workbooks  could  be  tailored  (for  reduced  sets  of  factors, 
criteria,  or  metrics)  just  as  the  current  worksheets  can  be  tailored. 
Data  that  is  not  needed  for  any  metric  elements  would  not  be  collected. 
To  support  this  tailoring,  the  workbook  should  include  references  to 
applicable  metric  elements  for  each  data  item  to  be  collected. 
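
The tailoring described here could work roughly as follows. This is a sketch under stated assumptions: the data item names and the metric element labels "M1" and "M2" are placeholders, and the function is hypothetical rather than part of the SQM methodology.

```python
# Hypothetical workbook entries: data item -> metric elements it supports.
# "M1" and "M2" stand in for real metric element mnemonics.
WORKBOOK = {
    "lines_of_code": ["M1", "M2"],
    "lines_of_comments": ["M2"],
    "nesting_level": ["M1"],
}


def tailor(workbook, selected_elements):
    """Keep only data items needed by at least one selected metric element."""
    selected = set(selected_elements)
    return {
        item: [element for element in elements if element in selected]
        for item, elements in workbook.items()
        if selected.intersection(elements)
    }


tailored = tailor(WORKBOOK, ["M2"])
print(tailored)
```

With only "M2" selected, the nesting level entry drops out of the tailored workbook because no remaining metric element refers to it.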

4.4.2  Scoring  Worksheets  3B  and  4B 

See: Guidebook Volume III, Appendix A, "Metric Worksheets"

The SQM methodology specifies that one copy of Worksheets 3B and 4B is
to be completed for each software unit. A unit is defined in
DOD-STD-2168 as the smallest logical entity specified in the detailed
design which completely describes a single function in sufficient detail
to allow implementing code to be produced and tested independently of
other units. The definition also applies to the units as the actual
physical entities implemented in code. In the worksheet context,
this definition seems to be what is meant by "unit."
Measurements are made against each unit, then rolled up into CSCI scores.

The advantage of completing a separate Worksheet 3B and 4B for each unit
of  the  code  and  design  lies  in  unit  independence.  Worksheet  answers  may 
be  easily  separated  and  distributed  to  the  authors  or  evaluators  of  each 
software  unit.  There  is  a  major  disadvantage,  however,  in  the  bulk  of 
the  paper  involved. 

In  all  but  the  smallest  systems,  the  evaluation  of  Worksheets  3B  and  4B, 
as  they  currently  exist,  would  create  a  mountain  of  paper.  As  an 
example,  consider  the  size  of  the  Enemy  Course  of  Action  Evaluation  Aid 
(ECOAEA) and the Enemy Sortie Capability Measurement Aid (ESCMA)
systems. Each of these systems is small. The ECOAEA consists of 29
files  having  150  subroutines  or  units.  The  ESCMA  consists  of  139 
programs  and  procedures. 

If  each  unit  were  documented  in  the  detailed  design,  then  the  10  pages 
of  Worksheet  3B  and  13  pages  of  Worksheet  4B  would  be  completed  for  each 
unit. The result of this would be approximately 10,800 pages just to
record  metric  element  answers.  On  a  large  system  project,  this  number 
could  quickly  become  astronomical. 

For  this  reason,  we  recommend  that  the  SQM  methodology  be  modified. 
This  recommendation  concerns  only  a  format  change  for  Worksheets  3B  and 
4B.  If  the  worksheets  were  modified  to  a  tabular  form,  the  amount  of 
paper  generated  could  be  drastically  reduced.  In  addition,  the  process 
of  scoring  each  CSCI  would  be  made  easier. 

As  an  example  of  the  reduction  in  paper  bulk,  the  following  tables  are 
proposed  for  Worksheets  3B  and  4B.  The  modified  Worksheet  4B  consists 
of  10  pages,  and  answers  for  15  units  may  be  contained  on  each  page. 
This  means  that  333  pages  are  required  to  hold  answers  for  500  modules 
(500  modules  divided  by  15  modules  per  page  multiplied  by  10  pages  per 
worksheet). Under the current methodology, 6,500 pages (13 pages of
Worksheet 4B multiplied by 500 modules) would be required. This is
better than a 19:1 reduction. For Worksheet 3B, 8 pages were used for
the tables. For both worksheets together, the page count for 500
modules is 600 pages using the modified tables, and 11,500 using the SQM
method.
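
The page counts quoted above can be checked with a short calculation. The sketch below uses the same fractional estimate as the text (it does not round partial pages up), and it assumes that the modified Worksheet 3B also holds 15 units per page, which is what the 600-page total implies. The function names are ours.

```python
def tabular_pages(units, units_per_page, pages_per_worksheet):
    """Pages needed when each page holds answers for several units.

    Fractional estimate, as in the text; a real print run would round
    each worksheet page up to a whole sheet.
    """
    return units / units_per_page * pages_per_worksheet


def per_unit_pages(units, pages_per_worksheet):
    """Pages needed when every unit gets its own copy of the worksheet."""
    return units * pages_per_worksheet


units = 500
new_4b = tabular_pages(units, 15, 10)  # modified Worksheet 4B
new_3b = tabular_pages(units, 15, 8)   # modified Worksheet 3B (15 per page assumed)
old_4b = per_unit_pages(units, 13)     # current Worksheet 4B
old_3b = per_unit_pages(units, 10)     # current Worksheet 3B

print("4B:", round(new_4b), "pages versus", old_4b)
print("3B and 4B:", round(new_4b + new_3b), "pages versus", old_4b + old_3b)
print("4B reduction ratio:", round(old_4b / new_4b, 1))
```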

Figure 4.4-2 is a sample of these worksheet answer tables.

These modified scoring worksheets can be used with the data collection
workbook described above. The workbooks record the
original information needed to answer the worksheet questions, but are
not themselves directly scorable. These scoring worksheets would be
used in conjunction with the workbook in order to provide a place to
calculate and record the scores for both Worksheets 3B and 4B.

4.4.3  Metric  Worksheets  Errors 

See: Guidebook Volume III, Appendix A, "Metric Worksheets"


Some  metric  questions  in  the  worksheets  have  typographical  and  other 
formatting  errors  in  their  text.  These  questions  are  identified  and 
described  below. 

SCORING ERRORS

Many  of  the  questions  on  Worksheets  3B  and  4B  are  answered  by  either  a 
"yes"  or  a  "no"  response.  It  is  then  appropriate  for  Worksheets  3A  and 
4A  to  total  the  number  of  "yes"  responses.  Occasionally,  Worksheet  3A 
or  4A  incorrectly  asks  the  analyst  to  "add  applicable  unit  scores,"  as 
shown  below: 

Worksheet 3A or 4A incorrect sample:

a. How many applicable units (score entered on 3B
[or 4B])?

b. What is the total score for all applicable units (add
applicable unit scores from 3B [or 4B])?

c. Calculate b/a and enter score.

This should be corrected to:

a. How many applicable units (answer of Y or N on 3B
[or 4B])?

b. How many units with answer of Y (see 3B [or 4B])?

c. Calculate b/a and enter score.

This correction applies to the following questions:

Worksheet 3A, FS.1(1)

For other Worksheet 3A and 4A questions, the reverse situation occurs.
Many of the questions on Worksheets 3B and 4B result in calculated
scores between 0 and 1. For some of these, Worksheets 3A or 4A
incorrectly ask the analyst to count the number of "units with an answer
of Y," as shown below:

Worksheet 3A or 4A incorrect sample:

a. How many applicable units (answer of Y or N on 3B
[or 4B])?

b. How many units with answer of Y (see 3B [or 4B])?

c. Calculate b/a and enter score.

This should be corrected to read:

a. How many applicable units (score entered on 3B
[or 4B])?

b. What is the total score for all applicable units (add
applicable unit scores from 3B [or 4B])?

c. Calculate b/a and enter score.

This correction applies to the following questions:

Worksheet 3A, SI.6(1)
Worksheet 4A, MO.1(4)
Worksheet 4A, SD.3(4)
Worksheet 4A, SI.4(7)
Worksheet 4A, SI.4(8)
Worksheet 4A, SI.4(9)
Worksheet 4A, SI.4(10)
Worksheet 4A, SI.4(11)
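
The two roll-up forms differ only in what is summed in part (b). A minimal sketch of both follows, assuming unit answers are recorded as "Y", "N", or "N/A" strings and unit scores as numbers (with None standing for a unit that is not applicable). These encodings and function names are our assumptions, not part of the worksheets.

```python
def rollup_yes_no(answers):
    """Part (c) for Y/N questions: units answered Y / applicable units."""
    applicable = [a for a in answers if a in ("Y", "N")]
    if not applicable:
        return None  # no applicable units; treated here as N/A
    return applicable.count("Y") / len(applicable)


def rollup_scores(scores):
    """Part (c) for scored questions: total unit score / applicable units."""
    applicable = [s for s in scores if s is not None]
    if not applicable:
        return None  # no applicable units; treated here as N/A
    return sum(applicable) / len(applicable)


print(rollup_yes_no(["Y", "N", "Y", "N/A"]))
print(rollup_scores([0.5, 1.0, None]))
```

Applying the Y/N form to a scored question discards the fractional unit scores, which is why the distinction matters for the questions listed above.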


Worksheet 3A, SI.6(1)


On  Worksheet  3B,  this  question  is: 

d.  How  many  unique  operations? 

e.  How  many  unique  operands? 

f.  How  many  total  operands? 

g.  Calculate  1  -  (2xe)/(dxf)  and  enter  score. 


Part (d) should contain "operators", not "operations".

Worksheets  4A  and  4B,  EP.1(3) 


Worksheet  4A  contains  the  question  as: 

a.  How  many  applicable  units  (score  entered  on  4B)? 

b.  What  is  total  score  for  all  applicable  units  (add 
applicable  unit  scores  from  4B)? 

c.  Calculate  b/a  and  enter  score. 


Worksheet  4B  contains: 

d.  How  many  units  are  required  to  be  optimized  for 
processing  efficiency? 

e.  How  many  units  are  optimized  for  processing  efficiency 
(i.e.,  compiled  using  an  optimizing  compiler  or  coded  in 
assembly  language)? 

f. Calculate 1-(e/d) and enter score.


The  question  should  be  eliminated  entirely  from  Worksheet  4B. 


The Worksheet 4A question should be rewritten. In addition, the
calculated score should have been e/d, not 1-(e/d) as shown in
Worksheet 4B. Worksheet 4A should then contain:

a. How many units are required to be optimized for processing
efficiency?

b. How many units are optimized for processing efficiency
(i.e., compiled using an optimizing compiler or coded in
assembly language)?

c. Calculate b/a and enter score.

Worksheet  4A,  ES.1(7) 


The  question  states: 

a.  How  many  total  software  units? 

b. How many software units are optimized for storage
efficiency?

c. Calculate 1-(b/a) and enter score.

This results in the score being lower the closer the system matches
its requirements. Part (c) should be:

c. Calculate b/a and enter score.

Worksheet  4A,  FS.l(l) 

The  score  "box"  for  recording  the  answer  to  part  (c)  of  this 
question  contains  "  Y  N  N/A".  It  should  contain  room  for  a 
numerical  answer,  followed  by  "N/A". 

Worksheets 4A and 4B, ID.2(2)

This question does not appear on Worksheet 4B. It states on 4A:

a. How many units in the CSCI?

b. How many units in the CSCI perform external input/output?

c. Calculate 1-(b/a) and enter score.

The question should be redone so that it does appear on 4A as:

a. How many applicable units (answer of Y or N on 4B)?

b. How many units with answer of Y (see 4B)?

c. Calculate 1-(b/a) and enter score.

Worksheet 4B should then contain:

d. Does this unit perform external input/output?

Worksheets 4A and 4B, ID.2(3)

This question does not appear on Worksheet 4B. It states on 4A:

a. How many units in the CSCI?

b. How many units in the CSCI contain operations dependent
on word or character size?

c. Calculate 1-(b/a) and enter score.

The question should be redone so that it does appear on 4A as:

a. How many applicable units (answer of Y or N on 4B)?

b. How many units with answer of Y (see 4B)?

c. Calculate 1-(b/a) and enter score.

Worksheet 4B should then contain:

d. Does this unit contain operations dependent on word
or character size?

Worksheets 4A and 4B, ID.2(4)

This question does not appear on Worksheet 4B. It states on 4A:

a. How many units in the CSCI?

b. How many units in the CSCI contain data element
representations that are machine dependent?

c. Calculate 1-(b/a) and enter score.

The question should be redone so that it does appear on 4A as:

a. How many applicable units (answer of Y or N on 4B)?

b. How many units with answer of Y (see 4B)?

c. Calculate 1-(b/a) and enter score.

Worksheet 4B should then contain:

d. Does this unit contain data element representations
that are machine dependent?


Worksheet 4B, AP.3(2)


This  question  is: 

d.  How  many  lines  of  source  code,  excluding  comments? 

e.  How  many  non-HOL  lines  of  code  excluding  comments? 

f.  Calculate  e/d  and  enter  score. 


This means that the more assembly language a routine contains, the
higher a score it would get. A unit with 100 lines of source code, 30
of them assembly language, scores 30/100 or .30. A unit with 100 lines
of source code, 90 of them assembly language, would get .90 for a score.
The reverse relationship should be shown. The question should have
part (f) changed to read:

f. Calculate 1-(e/d) and enter score.
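
The effect of the change to part (f) can be seen by computing both forms for the two cases in the text. The function names below are ours; the formulas are the ones quoted from the worksheet and from the correction.

```python
def ap_3_2_original(source_lines, non_hol_lines):
    """Part (f) as printed on Worksheet 4B: e/d."""
    return non_hol_lines / source_lines


def ap_3_2_corrected(source_lines, non_hol_lines):
    """Part (f) as recommended here: 1-(e/d)."""
    return 1 - non_hol_lines / source_lines


for assembly_lines in (30, 90):
    print(
        assembly_lines,
        "assembly lines of 100:",
        "original", round(ap_3_2_original(100, assembly_lines), 2),
        "corrected", round(ap_3_2_corrected(100, assembly_lines), 2),
    )
```

Under the corrected form, the unit with less assembly language receives the higher score.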


4.4.4  Scoring 


See:  Guidebook  Volume  III,  Appendix  B,  "Factor  Scoresheets" 


There  is  a  typographical  error  in  the  scoresheet  used  for  calculating 
the  factor  efficiency.  Figure  4.4-3  presents  the  scoresheet  as 
originally  shown  in  the  SQM  guidebooks.  Figure  4.4-4  shows  how  the 
scoresheet  should  be  corrected. 


4.4.5  Software  Quality  Evaluation  Report 


See:  Guidebook  Volume  III,  Appendix  C,  "Software  Quality  Evaluation 
Report" 


The  Data  Item  Description  recommended  in  the  SQM  methodology  for  the 
Software  Quality  Evaluation  Report  includes  tables  listing  metric 
scores,  compiling  criteria  scores,  and  compiling  factor  scores.  In 
addition,  the  scoresheets  used  to  calculate  scores  for  each  factor  are 
to  be  included. 


SAIC  recommends  eliminating  these  scoresheets  from  the  reports,  and  in 
their  place  adding  one  more  table  to  the  report.  Existing  tables 
already  list  all  scores  for  metrics,  for  criteria,  and  for  factors.  One 
more  table  could  be  included  to  list  scores  for  each  metric  element. 
The  scoresheet  data  would  then  be  covered  by  these  tables,  and  would  not 
need to be included at all.

The scoresheets are useful for calculating scores and should be
retained, but we recommend that they also be modified. Rather than
scoring each factor as is now done, we recommend scoring each criterion.
Factor scoresheets would then only show the criteria to be used, and
could include the acquisition manager's weighting formula for each
criterion used. Even if no other changes are made to the scoresheets,
their bulk could be reduced by putting more information on each page.
We were able to reduce the amount of paper by three pages by simply
cutting and pasting some of the pages together. To accomplish this
shortening, we placed the factor INTEGRITY on the same page as the
factor EFFICIENCY. VERIFIABILITY and INTEROPERABILITY were both
condensed from three pages to two.


We also recommend having a place to put information as needed on each
scoresheet. This information could include project, date, analyst, and
CSCI being evaluated. Figure 4.4-5 is a sample of how the modified factor
scoresheet might look. The current scoresheet format would be largely
retained, except that data would be included only to
the criteria level. Each criterion would then be summed on the factor
scoresheet page.


4.4.6  Glossary 


See:  Guidebook  Volume  III,  Appendix  A,  "Metric  Worksheets" 


A glossary needs to be created for each worksheet included in Appendix A
of Volume III of the guidebooks. This glossary should contain all terms
used in the worksheet, except such words as "the", "calculate", etc.
Every software-oriented term should be completely defined and reflect
the way that it is used in the particular worksheet. In addition, each
term should be used uniquely and consistently for a concept or item. As
an example, "data items" and "data references" seem to be used
interchangeably on the first two worksheets. Only one term should be
used.


4.4.7  Evaluation  Procedures 


See: Guidebook Volume III, Appendix A, "Metric Worksheets"


Steps to be used to answer each question and each situation
likely to occur should be defined for each worksheet. These steps would
likely be basically the same across the worksheets, with only some
specific guidance needed to answer unique questions.

An example of this guidance is:

1. For multiple part questions that relate the occurrences of an
item to the possible number of occurrences:

[Example: a. Number of calling parameters

b. Number of calling parameters that are control
variables.

c. Calculate b/a and enter score.]

If the answer to part (a) is zero, then the score received
for part (c) is "N/A".

2. For "all or nothing" questions, the answer must be "N" if
any of the measured items do not fully meet the question's
statement.

[Example: Are all inputs documented as to the specific use
and limitations of the data?]

If even 1 input out of 99 is not documented, the answer to
the example question is "N". Record the "No" instances so
that the acquisition manager and system analysts may have
this data available as needed. (The recommended workbook,
Section 4.4.1, is a good place for this information to be
recorded.)

3. Questions should only be scored as "N/A" under two
circumstances. If a question or element has been defined as
not applicable for the entire system and eliminated from the
scoring process, it may then be scored "N/A". The second
circumstance is when these procedures direct such a score
(see Procedure #1, above).
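
The first two procedures lend themselves to a direct statement in code. The sketch below is one possible reading, with hypothetical function names: a ratio question returns "N/A" when part (a) is zero, and an "all or nothing" question returns "N" together with the positions of the failing items so that they can be recorded.

```python
def ratio_score(total, matching):
    """Ratio questions: a zero count in part (a) makes part (c) N/A."""
    if total == 0:
        return "N/A"
    return matching / total


def all_or_nothing(items_meet_statement):
    """Answer N if any measured item fails the question's statement.

    Also returns the positions of the failing items, so the "No"
    instances can be retained for the acquisition manager.
    """
    failures = [i for i, ok in enumerate(items_meet_statement) if not ok]
    return ("N" if failures else "Y"), failures


print(ratio_score(0, 0))
print(ratio_score(4, 1))
print(all_or_nothing([True] * 98 + [False]))
```

In the last call, 98 documented inputs and one undocumented input still yield an answer of "N", with the undocumented input identified by position.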


4.4.8 Examples

See: Guidebook Volume III, Appendix A, "Metric Worksheets"

While SAIC recommends examples for every step in the methodology, we are
particularly interested in seeing examples presented for each metric
question in the worksheets. These examples need not be complex, but
should illustrate the kind of material that the question concerns, and
the focus of its statement. For Worksheets 3A & 3B, and 4A & 4B, the
examples could be directly related to a small sample of pseudo-language
used in design, and actual code used in constructing the equivalent
routine or unit, respectively. For other questions, a simple statement
of the types of information that would allow a user to determine an
answer to the question would be very helpful.

These examples would not be included in each worksheet itself, but in a
separate set of material that could be referenced alongside each
worksheet during evaluation. The examples could also reference and
include the workbooks (see Section 4.4.1) recommended for data
collection. 
[PAR-2]
DOD-STD-2167. Military Standard, Defense System Software
Development. 4 June 1985.

Standard 7935.1-S. Department of Defense, Automated Data
Systems Documentation Standards. 13 September 1977.

Bowen,  Thomas  P.,  Wigle,  Gary  B.,  and  Tsai,  Jay  T. 
Specification  of  Software  Quality  Attributes.  Vol  I. 
RADC-TR-85-37. AD-A153988. February, 1985.

Bowen, Thomas P., Wigle, Gary B., and Tsai, Jay T.
Specification of Software Quality Attributes: Software
Quality Specification Guidebook. Vol II. RADC-TR-85-37.
AD-A153989. February, 1985.

Bowen, Thomas P., Wigle, Gary B., and Tsai, Jay T. Specification
of Software Quality Attributes: Software Quality Evaluation
Guidebook. Vol III. RADC-TR-85-37. AD-A153990. February, 1985.


Par Technology. Selected Pages from Senior Battlestaff
Decision Aid Development Proposal. 1983.


Par Technology. Interim Technical Report: Senior Battle
Staff Decision Aids, Task 1: Planning. Contract #
F30602-83-C-0154. March 1984.

Rome Air Development Center. Statement of Work for Senior
Battle Staff Decision Aids. PR No. B-3-36I13. 2 Dec 1982.

Betac Corporation. Enemy Sortie Capability Measurement Aid:
Design Plan and Functional Description, "Senior Battle Staff
Decision Aids". CDRL A003 and A005. Sept 1985.

Par Technology. Enemy Course of Action Evaluation Aid,
Functional Description and Design Plan, Senior Battle Staff
Decision Aids. CDRL A003 and A005. July 1984.

Par Technology. Enemy Course of Action Evaluation Aid Final
Functional Description and Design Plan. CDRL A003 and A005.
SOFTWARE QUALITY MEASUREMENT DEMONSTRATION PROJECT


As  part  of  the  evaluation  of  the  Specification  of  Software  Quality 
Attributes  methodology,  SAIC  is  following  tRe  procedures  outlined  to 
measure  the  software  quality  of  two  decision  aid  systems.  These 
programs  are  the  Enemy  Sortie  Capability  Measurement  Aid,  and  the  Enemy 
Course  of  Action  Evaluation  Aid.  Both  aids  are  part  of  the  Senior 
Battlestaff  Decision  Aid  project. 

Because  of  your  involvement  and  experience  with  these  aids,  we  are 
asking  you  to  fill  out  some  forms  and  return  them  to  us.  These  forms 
will  be  used  to  determine  what  quality  goals  should  be  specified  for 
each  major  function  of  each  decision  aid.  They  will  also  be  used  to 
evaluate the adequacy of this method of quality goal specification.

In  addition  to  the  goal  specification  forms,  we  will  ask  you  to  fill  out 
a form giving us your response to this method of goal specification.
Please keep track of the time you spend on this complete task so that
you may completely fill in the last feedback form.

Figure 1 presents the 13 software quality factors we are concerned with,
along with definitions of each. To aid in understanding, these factors
are grouped under major concerns (performance of the system, how it is
The last form we will ask you to complete is Figure 2. This form gives
us information about you and your reactions to this survey.


Columns: ACQUISITION CONCERN, CRITERION, DEFINITION

PERFORMANCE

ACCURACY: Those characteristics of software which provide the required
precision in calculations and outputs.

ANOMALY MANAGEMENT: Those characteristics of software which provide for
continuity of operations under and recovery from non-nominal conditions.

AUTONOMY: Those characteristics of software which determine its
non-dependency on interfaces and functions.

DISTRIBUTEDNESS: Those characteristics of software which determine the
degree to which software functions are geographically or logically
separated within the system.

EFFECTIVENESS-COMMUNICATION: Those characteristics of the software which
provide for minimum utilization of communications resources in
performing functions.

EFFECTIVENESS-PROCESSING: Those characteristics of the software which
provide for minimum utilization of processing resources in performing
functions.

EFFECTIVENESS-STORAGE: Those characteristics of the software which
provide for minimum utilization of storage resources.

OPERABILITY: Those characteristics of software which determine
operations and procedures concerned with operation of software and which
provide useful inputs and outputs which can be assimilated.

RECONFIGURABILITY: Those characteristics of software which provide for
continuity of system operation when one or more processors, storage
units, or communication links fail.

SYSTEM ACCESSIBILITY: Those characteristics of software which provide
for control and audit of access to the software and data.

TRAINING: Those characteristics of software which provide transition
from current operation and provide initial familiarization.

DESIGN

COMPLETENESS: Those characteristics of software which provide full
implementation of the functions required.

CONSISTENCY: Those characteristics of software which provide for uniform
design and implementation techniques and notation.

TRACEABILITY: Those characteristics of software which provide a thread
of origin from the implementation to the requirements with respect to
the specified development envelope and operational environment.

VISIBILITY: Those characteristics of software which provide status
monitoring of the development and operation.

ADAPTATION

APPLICATION INDEPENDENCE: Those characteristics of software which
determine its nondependency on database system, microcode, computer
architecture, and algorithms.

AUGMENTABILITY: [definition illegible in source]

COMMONALITY: Those characteristics of software which provide for the use
of interface standards for protocols, routines, and data
representations.

DOCUMENT ACCESSIBILITY: Those characteristics of software which provide
for easy access to software and selective use of its components.

FUNCTIONAL OVERLAP: Those characteristics of software which provide
common functions to both systems.

FUNCTIONAL SCOPE: Those characteristics of software which provide
commonality of functions among applications.

GENERALITY: Those characteristics of software which provide breadth to
the functions performed with respect to the application.

INDEPENDENCE: Those characteristics of software which determine its
non-dependency on software environment (computing system, operating
system, utilities, input output routines, libraries).

SYSTEM CLARITY: Those characteristics of software which provide for
clear description of program structure in a non-complex and
understandable manner.

SYSTEM COMPATIBILITY: Those characteristics of software which provide
the hardware, software, and communication compatibility of two systems.

VIRTUALITY: Those characteristics of software which present a system
that does not require user knowledge of the physical, logical, or
topological characteristics.

GENERAL

MODULARITY: Those characteristics of software which provide a structure
of highly cohesive components with optimum coupling.

SELF-DESCRIPTIVENESS: Those characteristics of software which provide
explanation of the implementation of functions.

SIMPLICITY: Those characteristics of software which provide for
definition and implementation of functions in the most noncomplex and
understandable manner.
Table  3.  Software-Oriented  Criteria  Definitions 
[Form: ESCMA FUNCTIONS listed against FACTOR AND ASSOCIATED CRITERIA. Legible entries: Usability: Operability; Modularity.]


